Abstract. A permutation sequence $(\sigma_n)_{n\in\mathbb{N}}$ is said to be convergent if, for every fixed
permutation $\tau$, the density of occurrences of $\tau$ in the elements of the sequence converges.
We prove that such a convergent sequence has a natural limit object, namely a Lebesgue
measurable function $Z : [0,1]^2 \to [0,1]$ with the additional properties that, for every fixed
$x \in [0,1]$, the restriction $Z(x,\cdot)$ is a cumulative distribution function and, for every $y \in [0,1]$,
the restriction $Z(\cdot,y)$ satisfies a "mass" condition. This limit process is well-behaved: every
function in the class of limit objects is a limit of some permutation sequence, and two of these
functions are limits of the same sequence if and only if they are equal almost everywhere. An
ingredient in the proofs is a new model of random permutations, which generalizes previous
models and might be interesting for its own sake.



1. Introduction 

As usual, a permutation of a finite set $X$ is a bijective function of $X$ into itself. We shall
focus on permutations $\sigma$ on the set $X = \{1,\dots,n\} = [n]$, where $n$ is a positive integer, called
the length of $\sigma$ and denoted by $|\sigma|$. In this work, a permutation $\sigma$ on $[n]$ is represented by
$\sigma = (\sigma(1),\dots,\sigma(n))$, and the set of all permutations on $[n]$ is denoted by $S_n$. We denote by
$S = \bigcup_{n=1}^{\infty} S_n$ the set of all finite permutations. A graph $G = (V,E)$ is given by its vertex set
$V$ and its edge set $E \subseteq \{\{u,v\} \subseteq V : u \neq v\}$.

The main goal of this paper is to introduce a notion of convergence of a permutation
sequence $(\sigma_n)_{n\in\mathbb{N}}$ and to identify a natural limit object for such a convergent sequence whose
associated sequence of lengths $(|\sigma_n|)_{n\in\mathbb{N}}$ tends to infinity. Lovász and Szegedy [21] were
concerned with these questions in the case of graph sequences $(G_n)_{n\in\mathbb{N}}$. This has been further
investigated by Borgs et al. in [4] and [5], where, among other things, limits of graph sequences
were used to characterize the testability of graph parameters. The convergence of sequences
of combinatorial objects has also been addressed in other structures. For instance, graphs
with degrees bounded by a constant have been addressed in the recent works of Benjamini
and Schramm [2] and of Elek ([10], [11]). Elek and Szegedy [12] studied this problem for
hypergraphs. See Lovász [19] for a comprehensive survey of this area.

Currently, the main application of our results in this paper is in property testing of per- 
mutations. Roughly speaking, the objective of testing is to decide whether a combinatorial 
structure satisfies some property, or to estimate the value of some numerical function associ- 
ated with this combinatorial structure, by considering only a randomly chosen substructure of 
sufficiently large, but constant size. These problems are called property testing and parameter 
testing, respectively; a property or parameter is said to be testable if it can be estimated
accurately in this way. The algorithmic appeal of testability is evident, as, conditional on 
sampling, this leads to reliable constant-time randomized estimators for the said properties or 
parameters. In [16] four of the present authors address these questions through the prism of 
subpermutations. Among their main results are a permutation result in the direction of Alon 
and Shapira's [1] work on the testability of hereditary graph properties, and a permutation 
counterpart of the characterization of testable parameters by Borgs et al. [6].

Given the similarity of our results with the ones obtained in [21], we briefly describe that
work. Central in the arguments is the notion of a homomorphism of a graph $F$ into a graph
$G$, a function $\phi : V(F) \to V(G)$ that maps the vertex set $V(F)$ of $F$ into the vertex set
$V(G)$ of $G$ with the property that, for every edge $\{u,v\}$ in $F$, the pair $\{\phi(u),\phi(v)\}$ is an
edge in $G$. The number of homomorphisms of $F$ into $G$ is denoted by $\hom(F,G)$, while the
homomorphism density of $F$ into $G$ is given by the probability that a uniformly chosen $\phi$ is
a homomorphism:

\[ t(F,G) = \frac{\hom(F,G)}{|V(G)|^{|V(F)|}}. \tag{1} \]
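
As a small illustration of (1), the following Python sketch computes the homomorphism density by brute force over all maps from $V(F)$ to $V(G)$. The function name `hom_density` and the edge-list representation are our own illustrative choices and do not come from [21]; the enumeration is exponential in $|V(F)|$ and is only meant for very small graphs.

```python
from itertools import product


def hom_density(F_vertices, F_edges, G_vertices, G_edges):
    """Brute-force t(F, G) = hom(F, G) / |V(G)|^{|V(F)|} for simple graphs."""
    # Store the edges of G as unordered pairs for O(1) membership tests.
    G_adj = {frozenset(e) for e in G_edges}
    F_vertices = list(F_vertices)
    G_vertices = list(G_vertices)
    homs = 0
    # Enumerate every map phi : V(F) -> V(G).
    for image in product(G_vertices, repeat=len(F_vertices)):
        phi = dict(zip(F_vertices, image))
        # phi is a homomorphism if every edge of F is mapped to an edge of G.
        # If phi(u) == phi(v), the frozenset has one element and is never an edge.
        if all(frozenset((phi[u], phi[v])) in G_adj for u, v in F_edges):
            homs += 1
    return homs / len(G_vertices) ** len(F_vertices)
```

For instance, a single edge maps homomorphically into a triangle in 6 of the 9 possible ways, so the density is 2/3.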

It is natural to measure the similarity between two graphs $G$ and $G'$ by comparing the
homomorphism density of different graphs $F$ into them. This suggests defining a graph
sequence $(G_n)_{n\in\mathbb{N}}$ as being convergent if, for every (simple) graph $F$, the sequence of real
numbers $(t(F,G_n))_{n\in\mathbb{N}}$ converges. Lovász and Szegedy identify a class of natural limit objects
for such convergent sequences, which they call graphons, in the form of symmetric Lebesgue
measurable functions $W : [0,1]^2 \to [0,1]$. One might heuristically imagine the adjacency
matrix of a simple graph as a black-and-white television screen (a white pixel at position
$(i,j)$ represents an edge between vertices $i$ and $j$), a graph sequence as a sequence of TV sets
with higher and higher resolution, and the limiting graphon $W$ as the "perfect TV" where
each point $(x,y) \in [0,1]^2$ is a "pixel of infinitesimal size" and the color of each infinitesimal
pixel can be any shade of grey between black and white.

An important feature of this limit object is that it may be used to generate random graphs:
given a graphon $W : [0,1]^2 \to [0,1]$ and a positive integer $n$, a $W$-random graph $G(n,W)$
with vertex set $[n]$ is generated as follows. First, $n$ real numbers $X_1,\dots,X_n$ are generated
independently according to the uniform probability distribution on the interval $[0,1]$. Then,
for every pair of distinct vertices $i$ and $j$ in $[n]$, the pair $\{i,j\}$ is added to the edge set of
the graph independently with probability $W(X_i,X_j)$. Heuristically, the adjacency matrix of
$G(n,W)$ looks like the graphon $W$ if $1 \ll n$: we approximate the perfect TV screen by choosing
the coordinates of infinitesimal pixels of $W$ at random and declaring a pixel of $G(n,W)$ white
with a probability proportional to the greyness of the corresponding infinitesimal pixel of $W$.
It is important to point out that this model of random graphs generalizes the random graph
model $G(n,H)$ (see Lovász and Sós [20]), further generalizing the classical model $G_{n,p}$ due to
Erdős and Rényi [13] and Gilbert [14].
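
The generation of a $W$-random graph described above can be sketched in Python as follows. This is only a minimal illustration: the name `w_random_graph`, the representation of $W$ as a Python callable, and the optional `rng` argument are assumptions of this sketch, not notation from the works cited.

```python
import random


def w_random_graph(n, W, rng=None):
    """Sample G(n, W): returns the latent points X_1..X_n and the edge set on [n]."""
    rng = rng or random.Random()
    # Step 1: n independent uniform points in [0, 1].
    xs = [rng.random() for _ in range(n)]
    edges = set()
    # Step 2: each pair {i, j} becomes an edge independently with probability W(X_i, X_j).
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < W(xs[i], xs[j]):
                edges.add((i + 1, j + 1))  # vertices are labelled 1..n
    return xs, edges
```

Taking `W` to be a constant function with value `p` gives each pair the same edge probability `p`, which is how the classical model is recovered as a special case.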

With this, Lovász and Szegedy define the homomorphism density of a $k$-vertex graph $F$
into a graphon $W$ as the probability $t(F,W)$ that $F$ is a subgraph of the $W$-random graph
$G(k,W)$, which can be calculated as follows given the above definition of $G(k,W)$:

\[ t(F,W) = \int_0^1 \cdots \int_0^1 \prod_{\{i,j\}\in E(F)} W(x_i,x_j)\, dx_1 \cdots dx_k. \tag{2} \]

They use this to prove that, with each convergent graph sequence $(G_n)_{n\in\mathbb{N}}$, one may associate
a graphon $W$ such that, for every fixed $F$,

\[ \lim_{n\to\infty} t(F,G_n) = t(F,W). \]



LIMITS OF PERMUTATION SEQUENCES 






The graphon $W$ is said to be a limit of $(G_n)_{n\in\mathbb{N}}$. Conversely, Lovász and Szegedy show
that, for any fixed graphon $W$, the randomized sequence $(G(n,W))_{n\in\mathbb{N}}$ converges to $W$ with
probability one. Hence, given a graphon $W$, there exists a graph sequence $(G_n(W))_{n\in\mathbb{N}}$
converging to $W$.

In our paper, a similar path is traced for permutation sequences $(\sigma_n)_{n\in\mathbb{N}}$. The role of the
homomorphism density $t(F,G_n)$ of a fixed graph $F$ into $G_n$ is played here by the subpermutation
density $t(\tau,\sigma_n)$ of a fixed permutation $\tau$ into $\sigma_n$, which we now define. By $[n]^m_<$ we
mean the set of $m$-tuples in $[n]$ whose elements are in strictly increasing order.

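Definition 1.1 (Subpermutation density). For positive integers $k,n \in \mathbb{N}$, let $\tau \in S_k$ and
$\pi \in S_n$. The number of occurrences $\Lambda(\tau,\pi)$ of the permutation $\tau$ in $\pi$ is the number of $k$-tuples
$(x_1,x_2,\dots,x_k) \in [n]^k_<$ such that $\pi(x_i) < \pi(x_j)$ if and only if $\tau(i) < \tau(j)$. The density
of the permutation $\tau$ as a subpermutation of $\pi$ is given by

\[ t(\tau,\pi) = \begin{cases} \binom{n}{k}^{-1} \Lambda(\tau,\pi) & \text{if } k \le n, \\ 0 & \text{if } k > n. \end{cases} \tag{3} \]
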

As an illustration, the permutation $\tau = (3,1,4,2)$ occurs in $\pi = (5,6,2,4,7,1,3)$, since $\pi$
maps the index set $(1,3,5,7)$ onto $(5,2,7,3)$, which appears in the relative order given by $\tau$.
This concept may be used to define a convergent permutation sequence in a natural way.
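
The counting in Definition 1.1 can be checked directly on small examples. The Python sketch below is a brute-force illustration; the function names `occurrences` and `density` are ours, and the normalization by $\binom{n}{k}$ (with density 0 when $k > n$) is the convention we assume for the density, consistent with the identity $\sum_{\tau \in S_k} t(\tau,\pi) = 1$ for $k \le |\pi|$ used in Section 2.4.

```python
from itertools import combinations
from math import comb


def occurrences(tau, pi):
    """Number of increasing index tuples on which pi has the relative order of tau."""
    k, n = len(tau), len(pi)
    count = 0
    for idx in combinations(range(n), k):  # increasing k-tuples of positions
        vals = [pi[i] for i in idx]
        # The tuple is an occurrence if pi(x_i) < pi(x_j) exactly when tau(i) < tau(j).
        if all((vals[i] < vals[j]) == (tau[i] < tau[j])
               for i in range(k) for j in range(k)):
            count += 1
    return count


def density(tau, pi):
    """Assumed normalization: occurrences divided by C(n, k), and 0 if k > n."""
    k, n = len(tau), len(pi)
    if k > n:
        return 0.0
    return occurrences(tau, pi) / comb(n, k)
```

With $\tau = (3,1,4,2)$ and $\pi = (5,6,2,4,7,1,3)$, the positions $(1,3,5,7)$ of the illustration are among the tuples counted by `occurrences`.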

Definition 1.2 (Convergence of a permutation sequence). A permutation sequence $(\sigma_n)_{n\in\mathbb{N}}$
is convergent if, for every fixed permutation $\tau$, the sequence of real numbers $(t(\tau,\sigma_n))_{n\in\mathbb{N}}$
converges.

The interesting case occurs when the sequence of lengths $(|\sigma_n|)_{n\in\mathbb{N}}$ tends to infinity, since, as
we shall see, every convergent permutation sequence $(\sigma_n)_{n\in\mathbb{N}}$ is otherwise eventually constant.
We prove that, when $|\sigma_n| \to \infty$, any convergent permutation sequence has a natural limit
object, called a limit permutation (all asymptotics in this paper are with respect to $n \to \infty$).
This limit object consists of a family of cumulative distribution functions, or cdf for short.
We say that a function $F : [0,1] \to [0,1]$ is a cdf if

$F$ is a non-decreasing and right-continuous function with $F(0) \ge 0$ and $F(1) = 1$. (4)

Note that $F$ is a cdf if and only if there is a $[0,1]$-valued random variable $Y$ such that, for
every $y \in [0,1]$, we have $F(y) = P(Y \le y)$.

Definition 1.3 (Limit permutation). A limit permutation is a Lebesgue measurable function
$Z : [0,1]^2 \to [0,1]$ satisfying the following conditions:

(a) for every $x \in [0,1]$, the function $Z(x,\cdot)$ is a cdf, i.e., (4) holds;

(b) for every $y \in [0,1]$, the function $Z(\cdot,y)$ satisfies $\int_0^1 Z(x,y)\,dx = y$.

We denote the set of limit permutations by $\mathcal{Z}$.

In fact, limit permutations are in one-to-one correspondence with probability measures on
the unit square whose marginals are uniformly distributed in $[0,1]$, as discussed below. The
heuristic picture can again be imagined using TV screens: a permutation $\sigma \in S_n$ might be
represented by the square matrix $A_\sigma = (\delta_{\sigma(i),j})_{i,j=1}^{n}$, where $\delta_{ij}$ denotes Kronecker's delta.

Again, if A(i,j) = 1, we say that the corresponding pixel is white. A TV screen obtained from 
a permutation has the property that the brightness of each row and each column is the same, 
since each row and column has a unique white pixel in it. Now imagine a convergent sequence 
of permutations as a sequence of TV screens showing the same image with higher and higher 
resolution, and the limit permutation as the "perfect TV". The brightness distribution of the
limit object can be represented by a probability measure $\mu$ on $[0,1]^2$ and inherits the property
that the total brightness of each "row" and each "column" is the same, i.e., the marginal
distributions of $\mu$ are uniform on $[0,1]$. Consider the $[0,1]^2$-valued random variable $(X,Y)$
with distribution $\mu$. Denote by $Z : [0,1]^2 \to [0,1]$ the regular conditional distribution function
of $Y$ given $X$ (for a formal definition of this concept, see Shiryaev [23] and Lemma 2.2
below; heuristically, we may think of $Z(x,\cdot)$ as the cdf of $Y$ under the condition $X = x$).
The function $Z$ will satisfy Definition 1.3(b) because $X$ and $Y$ are uniformly distributed
on $[0,1]$. The fact that we may think of limit permutations as regular conditional distribution
functions $Z$, or as probability distributions $\mu$, or as random variables $(X,Y)$ as above will be
useful in what follows.

As with graphs, limit permutations may be used to define a model of random permutations. 

Definition 1.4 ($Z$-random permutation). Given a limit permutation $Z$, we generate the
$Z$-random permutation $\sigma(n,Z)$ as follows. A sequence of $n$ real numbers $X_1,\dots,X_n$
is generated independently and uniformly on $[0,1]$. Conditional on $(X_1,\dots,X_n)$, we generate
$n$ real numbers $Y_1,\dots,Y_n$, with each $Y_i$ generated independently according to the cdf $Z(X_i,\cdot)$.
Let $(X_1^*,\dots,X_n^*)$ and $(Y_1^*,\dots,Y_n^*)$ denote the values $X_1,\dots,X_n$ and $Y_1,\dots,Y_n$, respectively,
rearranged in increasing order. The $Z$-random permutation $\sigma = \sigma(n,Z)$ is given by $\sigma(i) = j$
if and only if $X_i^* = X_\ell$ and $Y_j^* = Y_\ell$ for some $\ell \in [n]$.

In other words, the $Z$-random permutation $\sigma = \sigma(n,Z)$ is given by the relative order of
the vertical coordinates of the points $(X_1,Y_1),\dots,(X_n,Y_n)$ with respect to their horizontal
coordinates (we shall see later that, with probability one, $X_i \neq X_j$ and $Y_i \neq Y_j$ if $i \neq j \in [n]$).

This new model of random permutation generalizes the classical random permutation
model, in which a permutation is selected uniformly at random from all permutations on
$[n]$. Indeed, a classical random permutation may be obtained as a $Z$-random permutation for
the uniform limit permutation $Z_u$, where $Z_u(x,y) = y$ for all $(x,y) \in [0,1]^2$. The distribution
of the corresponding $(X,Y)$ is uniform on $[0,1]^2$, i.e., $X$ and $Y$ are independent and uniform
on $[0,1]$.
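
To make the construction concrete, the following Python sketch reads off the permutation determined by $n$ points of the unit square and then specializes to the uniform limit permutation $Z_u$, for which the vertical coordinates are independent uniform numbers. The function names are hypothetical, and ties among coordinates (a probability-zero event) are not handled.

```python
import random


def permutation_from_points(xs, ys):
    """sigma(i) = j when the point with the i-th smallest x has the j-th smallest y."""
    n = len(xs)
    by_x = sorted(range(n), key=lambda l: xs[l])  # point labels in increasing x
    by_y = sorted(range(n), key=lambda l: ys[l])  # point labels in increasing y
    y_rank = {l: j + 1 for j, l in enumerate(by_y)}
    return tuple(y_rank[l] for l in by_x)


def uniform_random_permutation(n, rng):
    """Z_u-random permutation: Z_u(x, .) is the cdf of U[0,1] for every x."""
    xs = [rng.random() for _ in range(n)]
    ys = [rng.random() for _ in range(n)]
    return permutation_from_points(xs, ys)
```

For example, the points $(0.1, 0.5)$, $(0.4, 0.2)$, $(0.9, 0.8)$ give the permutation $(2,1,3)$.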

Again inspired by the graph case, given a limit permutation $Z$, we may define the subpermutation
density $t(\tau,Z)$ of a permutation $\tau$ on $[k]$ in $Z$ as the probability that the $Z$-random
permutation $\sigma(k,Z)$ is equal to $\tau$.

Definition 1.5. Let $(\sigma_n)_{n\in\mathbb{N}}$ be a sequence of permutations such that $|\sigma_n| \to \infty$. Let $Z \in \mathcal{Z}$.
We say that $(\sigma_n)_{n\in\mathbb{N}}$ converges to $Z$, or briefly write $\sigma_n \to Z$, if

\[ \forall \tau \in S : \quad \lim_{n\to\infty} t(\tau,\sigma_n) = t(\tau,Z). \tag{5} \]

Note that the assumption $|\sigma_n| \to \infty$ is quite natural: we prove in Claim 2.4 that if $(\sigma_n)_{n\in\mathbb{N}}$
converges and $(|\sigma_n|)_{n\in\mathbb{N}}$ has a bounded subsequence then the sequence $(\sigma_n)_{n\in\mathbb{N}}$ is eventually
constant.

Theorem 1.6 (Main result). 

(i) Given a convergent permutation sequence $(\sigma_n)_{n\in\mathbb{N}}$ for which $|\sigma_n| \to \infty$, there exists a
limit permutation $Z$ such that $\sigma_n \to Z$ holds.

(ii) Conversely, every $Z \in \mathcal{Z}$ is a limit of a convergent permutation sequence, i.e., there
is a sequence $(\sigma_n)_{n\in\mathbb{N}}$ such that $\sigma_n \to Z$ holds.

Theorem 1.6 naturally raises the question of characterizing the limit permutations that are
limits of the same given convergent permutation sequence $(\sigma_n)_{n\in\mathbb{N}}$. It is clear that this limit
is not unique in a strict sense, because, if $\sigma_n \to Z$ and $A$ is a measurable subset of $[0,1]$ with
measure zero, then any limit permutation obtained through the replacement of the cdf $Z(x,\cdot)$
by a cdf $Z^*(x,\cdot)$ for every $x \in A$ is also a limit of $(\sigma_n)_{n\in\mathbb{N}}$, as the value of the probabilities
corresponding to $t(\tau,Z)$ and $t(\tau,Z^*)$ will be identical. This question has also been raised
in the case of graph limits in [3], where uniqueness was captured by an equivalence relation
induced by a pseudometric $d_\square$ between graphons. Two graphons $W$ and $W^*$ were proved to
be limits of the same graph sequence if and only if $d_\square(W,W^*) = 0$. The case of permutations
is simpler.

Theorem 1.7. Let $Z_1, Z_2 \in \mathcal{Z}$. Let $(\sigma_n)_{n\in\mathbb{N}}$ be a convergent permutation sequence. Then
$\sigma_n \to Z_1$ and $\sigma_n \to Z_2$ if and only if the set $\{x : Z_1(x,\cdot) \neq Z_2(x,\cdot)\}$ has Lebesgue measure
zero.

Recall that limit permutations $Z$ may be viewed as certain probability distributions $\mu$ or
as certain random variables $(X,Y)$ (see the discussion just after Definition 1.3). Theorem 1.7
implies that limit permutations are unique when viewed as such probability distributions or
random variables.

Based on a previous concept by Cooper [8], we may introduce a distance $d_\square$ between
permutations (see (31)) and, more generally, between limit permutations, which is a permutation
counterpart of the graph pseudometric discussed in the previous paragraph. In particular, we
may characterize our notion of convergence of permutation sequences in terms of this metric.
As usual, a sequence $(\sigma_n)_{n\in\mathbb{N}}$ is said to be a Cauchy sequence with respect to the metric $d_\square$
if, for every $\varepsilon > 0$, there exists $n_0 = n_0(\varepsilon)$ such that $d_\square(\sigma_n,\sigma_m) < \varepsilon$ for every $n, m > n_0$.

Theorem 1.8. A permutation sequence $(\sigma_n)_{n\in\mathbb{N}}$ converges if and only if it is a Cauchy
sequence with respect to the metric $d_\square$.

Also in analogy with the work for graphs by Lovász and Szegedy, the theory in this paper
can be considered in terms of the discrete metric space $(S, d_\square)$, where $S = \bigcup_{n=1}^{\infty} S_n$ is the
set of all finite permutations and $d_\square$ is the metric of the previous paragraph. By a standard
diagonalization argument, every permutation sequence can be shown to have a convergent
subsequence (see Lemma 2.5). As a consequence, the metric space $(S, d_\square)$ can be enlarged
to a compact metric space $(\mathcal{Z}/{\sim}, d_\square)$ by adding limit permutations, where we identify limit
permutations that are equal almost everywhere. By Theorem 1.6 the subspace of permutations
is dense in $\mathcal{Z}/{\sim}$; moreover, it is discrete, since a sequence cannot converge to a permutation
without being eventually constant (Claim 2.4). Finally, Theorem 1.8 tells us that, when
restricted to permutations, convergence in this metric space coincides with the concept of
convergence in Definition 1.2.

We have found two essentially different paths for establishing the main results in this paper. 
One of them starkly resembles the work of Lovász and Szegedy [21] for graph sequences, which
relies on Szemerédi-type regularity arguments. However, several difficulties of a technical nature
arise, as the limit objects here are more constrained than in the graph case. For a detailed
account of this approach, we refer the reader to [17], while most of its permutation regularity
ingredients may be found in [15].

In this paper, we have opted for an alternative approach, with a distinctive probabilistic 
flavor, which has the advantage of being both more compact and more direct, since several of 
the technicalities of the first approach can be avoided. 

The remainder of this paper is structured as follows. In Section 2 we collect some preliminary
results from probability theory. For instance, we discuss the relation between limit
permutations and probability measures on the unit square, and we recall some facts about
weak convergence of probability measures in this context. Moreover, we prove some simple
facts about the convergence of permutation sequences. Section 3 is devoted to a discussion of
$Z$-random permutations and of the probabilistic meaning of subpermutation densities. Section 4
deals with the rectangular distance on $S$ and $\mathcal{Z}$, and we prove that a large $Z$-random
permutation is close to $Z$ in the $d_\square$-distance with high probability. In Section 5 we define three
natural, different notions of convergence on $\mathcal{Z}$, and prove that they are all equivalent. This
is then used to prove Theorems 1.6, 1.7 and 1.8. For completeness, we provide an appendix
containing the proof of an auxiliary probabilistic lemma that turns out to be important in
our proofs, but which may be proven by standard measure-theoretic arguments.

2. Preliminaries 

The present section introduces the main concepts of probability theory that are important
in this paper, and gives the proofs of two simple remarks about the convergence of permutation
sequences. It is organized as follows. Subsection 2.1 deals with relevant properties of
probability distributions on the unit square. In Subsection 2.2 we recall useful facts about
weak convergence of probability measures on compact subspaces of $\mathbb{R}^d$, while in Subsection
2.3 we relate limit permutations to probability distributions on the unit square using the
notion of regular conditional probability distributions. The remarks about the convergence
of permutation sequences are addressed in Subsection 2.4.

2.1. Probability distributions on the unit square. If $A \subseteq \Omega$ is an event in a probability
space, we denote by $\mathbb{1}[A]$ the indicator of the event, i.e., the random variable which takes
value 1 if $A$ occurs and value 0 if $A$ does not occur.

We shall work with random variables $(X,Y)$ that take values in the unit square $[0,1]^2$:
$X$ is the horizontal and $Y$ is the vertical coordinate of the random point $(X,Y)$. The joint
distribution of $(X,Y)$ can be represented in many different ways.

We might define a probability measure $\mu$ on $[0,1]^2$ by defining $\mu(B) = P((X,Y) \in B)$ for
every Borel set $B \subseteq [0,1]^2$, but since the $\sigma$-algebra of Borel sets is generated by the collection
of rectangles of the form $B = [0,x] \times [0,y]$, it is enough to specify $\mu(B)$ for sets of this form
to define $\mu$ uniquely. Thus we define the joint probability distribution function $F$ of $(X,Y)$ by

\[ F(x,y) = P(X \le x,\, Y \le y) = \mu([0,x] \times [0,y]), \qquad x,y \in [0,1]. \]

We call the distributions of $X$ and $Y$ the first and second marginal distributions of $(X,Y)$,
respectively, which satisfy $P(X \le x) = F(x,1)$ and $P(Y \le y) = F(1,y)$.
If $x_1 \le x_2$ and $y_1 \le y_2$ then

\[ P\big(X \in (x_1,x_2],\, Y \in (y_1,y_2]\big) = F(x_2,y_2) - F(x_1,y_2) - F(x_2,y_1) + F(x_1,y_1). \tag{6} \]

The $[0,1]^2$-valued random variables that correspond to limit permutations will always have
the property that both $X$ and $Y$ are uniformly distributed on $[0,1]$, which we denote by
$X, Y \sim U[0,1]$. This happens if and only if we have $F(x,1) = x$, $F(1,y) = y$ for any
$x,y \in [0,1]$.

We shall make use of the following simple inequality many times:

\[ X, Y \sim U[0,1] \implies \forall x_1,x_2,y_1,y_2 \in [0,1] : \ |F(x_2,y_2) - F(x_1,y_1)| \le |x_2 - x_1| + |y_2 - y_1|. \tag{7} \]

A direct consequence of the fact that $X$ and $Y$ are both uniform is that, for any rectangle
$R = [x_1,x_2] \times [y_1,y_2] \subseteq [0,1]^2$, we have

\[ P((X,Y) \in \partial R) \le P(X = x_1) + P(X = x_2) + P(Y = y_1) + P(Y = y_2) = 0, \tag{8} \]

where $\partial R$ denotes the boundary of $R$. In particular we have

\[ F(x,y) = P(X \le x,\, Y \le y) = P(X < x,\, Y < y). \]

If $(X_1,Y_1)$ and $(X_2,Y_2)$ are both $[0,1]^2$-valued random variables then we say that they
have the same distribution or briefly write $(X_1,Y_1) \sim (X_2,Y_2)$ if their respective probability
measures $\mu_1$ and $\mu_2$ agree on the Borel sets of $[0,1]^2$, or, equivalently, if their respective joint
distribution functions are the same: $F_1(x,y) = F_2(x,y)$ for all $x,y \in [0,1]$.

2.2. Weak convergence of probability measures. Now we recall some well-known facts
about weak convergence of probability measures (for details, see [3]).

Let $\Omega$ be a complete, separable metric space and let $\mathcal{B}$ denote the $\sigma$-algebra of Borel sets.
If $\mu_1, \mu_2, \dots$ and $\mu$ are probability measures on the measurable space $(\Omega,\mathcal{B})$, then we say that
the sequence $(\mu_n)_{n=1}^{\infty}$ converges weakly to $\mu$ (which we denote $\mu_n \Rightarrow \mu$) if for all bounded,
continuous functions $f : \Omega \to \mathbb{R}$ we have

\[ \lim_{n\to\infty} \int_\Omega f(\omega)\,d\mu_n(\omega) = \int_\Omega f(\omega)\,d\mu(\omega). \tag{9} \]

We will make use of the following consequence of Prokhorov's theorem (see Chapter 1, Section
5 of [3]) characterizing compact subsets of the space of probability measures on $\Omega$.

If $\Omega$ is compact then every sequence $(\mu_n)_{n=1}^{\infty}$ has a weakly convergent subsequence. (10)

For $\Omega = \mathbb{R}^d$, if we denote by $(X^1,\dots,X^d)$ the $\mathbb{R}^d$-valued random variable with distribution
$\mu$, and, similarly, we let $(X_n^1,\dots,X_n^d)$ be the random variable with distribution $\mu_n$, then we say
that $(X_n^1,\dots,X_n^d)$ converges in distribution to $(X^1,\dots,X^d)$ (or briefly write
$(X_n^1,\dots,X_n^d) \xrightarrow{d} (X^1,\dots,X^d)$) if $\mu_n \Rightarrow \mu$.

It easily follows from (9) that

\[ (X_n^1,\dots,X_n^d) \xrightarrow{d} (X^1,\dots,X^d) \implies \forall i \in [d] : \ X_n^i \xrightarrow{d} X^i. \tag{11} \]

In other words, weak convergence of the joint $d$-dimensional distributions implies weak convergence
of the marginal distributions. As a consequence we obtain the following useful fact
about weak convergence of $[0,1]^2$-valued random variables with uniform marginals:

\[ \forall n \in \mathbb{N} : \ X_n, Y_n \sim U[0,1], \ \text{and} \ (X_n,Y_n) \xrightarrow{d} (X,Y) \implies X, Y \sim U[0,1]. \tag{12} \]

The converse implication of (11) does not hold in general, but it does hold in the special
case when the marginals are independent.

Let $\Omega^i$ denote a metric space for all $i \in [d]$, and denote by $\Omega = \Omega^1 \times \dots \times \Omega^d$ the product
space (we may define the distance between points of $\Omega$ to be the $L_\infty$-distance, that is, the
maximum of the distance of each pair of coordinates). If $\mu_n^i$, $\mu^i$ are probability measures on
$\Omega^i$ then

\[ \forall i \in [d] : \ \mu_n^i \Rightarrow \mu^i \implies \mu_n^1 \times \dots \times \mu_n^d \Rightarrow \mu^1 \times \dots \times \mu^d, \tag{13} \]

where $\mu^1 \times \dots \times \mu^d$ denotes the product measure of the measures $\mu^1,\dots,\mu^d$ on the product
space $\Omega$. For the proof of this fact, see Theorem 2.8(ii) of [3].

We say that $B \in \mathcal{B}$ is a continuity set of the measure $\mu$ if $\mu(\partial B) = 0$. The following
equivalent characterization of weak convergence is a consequence of the Portmanteau theorem
(see Theorem 2.1 of [3]):

\[ \mu_n \Rightarrow \mu \iff \mu_n(B) \to \mu(B) \ \text{for all continuity sets } B \text{ of } \mu. \tag{14} \]

Lemma 2.1. Let $(\mu_n)_{n=1}^{\infty}$ and $\mu$ be probability measures on $[0,1]^2$. Denote by $(X_n,Y_n)$ and
$(X,Y)$ the corresponding $[0,1]^2$-valued random variables and by $F_n(x,y)$ and $F(x,y)$ the
corresponding joint probability distribution functions. Assume that for all $n \in \mathbb{N}$, we have
$X_n, Y_n \sim U[0,1]$. Then

\[ (X_n,Y_n) \xrightarrow{d} (X,Y) \iff \|F_n - F\|_\infty = \sup_{x,y\in[0,1]} |F_n(x,y) - F(x,y)| \to 0. \]

Proof. In order to prove that $\|F_n - F\|_\infty \to 0$ implies $(X_n,Y_n) \xrightarrow{d} (X,Y)$ we need to show (9).
It is enough to prove that, for all $\varepsilon > 0$ and for every continuous function $f : [0,1]^2 \to \mathbb{R}$, we
have

\[ \limsup_{n\to\infty} \left| \int_0^1\!\!\int_0^1 f(x,y)\,d\mu_n(x,y) - \int_0^1\!\!\int_0^1 f(x,y)\,d\mu(x,y) \right| \le \varepsilon. \tag{15} \]

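Since $[0,1]^2$ is compact, $f$ is uniformly continuous, so we can choose $k \in \mathbb{N}$ such that if we
define $B_k(i,j) = \left[\frac{i-1}{k},\frac{i}{k}\right] \times \left[\frac{j-1}{k},\frac{j}{k}\right]$ then we have

\[ \forall i,j \in [k] : \ \max\{f(x,y) : (x,y) \in B_k(i,j)\} - \min\{f(x,y) : (x,y) \in B_k(i,j)\} \le \varepsilon. \]

We bound the integral on the l.h.s. of (15):

\begin{align*}
\int_0^1\!\!\int_0^1 f(x,y)\,\big(d\mu_n(x,y) - d\mu(x,y)\big)
 &\le \sum_{i,j=1}^{k} \Big( \mu_n(B_k(i,j)) \cdot \max_{B_k(i,j)} f - \mu(B_k(i,j)) \cdot \min_{B_k(i,j)} f \Big) \\
 &\le \sum_{i,j=1}^{k} \mu(B_k(i,j))\,\varepsilon + \sum_{i,j=1}^{k} \|f\|_\infty\,\big|\mu_n(B_k(i,j)) - \mu(B_k(i,j))\big| \\
 &\le \varepsilon + k^2 \|f\|_\infty \cdot 4\|F_n - F\|_\infty .
\end{align*}

With the same arguments, we may prove that

\[ \int_0^1\!\!\int_0^1 f(x,y)\,\big(d\mu(x,y) - d\mu_n(x,y)\big) \le \varepsilon + k^2 \|f\|_\infty \cdot 4\|F_n - F\|_\infty, \]

so that

\[ \left| \int_0^1\!\!\int_0^1 f(x,y)\,\big(d\mu_n(x,y) - d\mu(x,y)\big) \right| \le \varepsilon + k^2 \|f\|_\infty \cdot 4\|F_n - F\|_\infty. \]

Now (15) follows from the above inequality and $\|F_n - F\|_\infty \to 0$.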



In order to prove that $(X_n,Y_n) \xrightarrow{d} (X,Y)$ implies $\|F_n - F\|_\infty \to 0$, we will have to make
use of our assumption that $X_n, Y_n \sim U[0,1]$ for every $n$ (this implication does not hold for
general sequences of weakly convergent probability measures on $[0,1]^2$).

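We are going to show that

\[ \forall k \in \mathbb{N} \ \ \exists n_0 \in \mathbb{N} \ \ \forall n \ge n_0 : \quad \|F_n - F\|_\infty = \sup_{x,y\in[0,1]} |F_n(x,y) - F(x,y)| \le \frac{5}{k}. \tag{16} \]

Since $X_n, Y_n \sim U[0,1]$ for every $n$ and $(X_n,Y_n) \xrightarrow{d} (X,Y)$, we know from (12) that
$X, Y \sim U[0,1]$. By (8), it follows that, for every rectangle $R \subseteq [0,1]^2$, $P((X,Y) \in \partial R) = 0$,
so that rectangles are continuity sets with respect to the associated measure $\mu$. We may then
apply (14) to deduce that $F_n(x,y) \to F(x,y)$ for all $x,y \in [0,1]$. Thus there exists $n_0 \in \mathbb{N}$
such that for all $n \ge n_0$ and $i,j \in \{0,1,\dots,k\}$ we have

\[ \left| F_n\!\left(\tfrac{i}{k},\tfrac{j}{k}\right) - F\!\left(\tfrac{i}{k},\tfrac{j}{k}\right) \right| \le \frac{1}{k}. \tag{17} \]

In order to deduce (16) from this, pick $x,y \in [0,1]$ and let $i := \lfloor k \cdot x \rfloor$ and $j := \lfloor k \cdot y \rfloor$. Then, by (7) and (17),

\begin{align*}
|F_n(x,y) - F(x,y)|
 &\le \left| F_n(x,y) - F_n\!\left(\tfrac{i}{k},\tfrac{j}{k}\right) \right|
   + \left| F_n\!\left(\tfrac{i}{k},\tfrac{j}{k}\right) - F\!\left(\tfrac{i}{k},\tfrac{j}{k}\right) \right|
   + \left| F\!\left(\tfrac{i}{k},\tfrac{j}{k}\right) - F(x,y) \right| \\
 &\le \frac{2}{k} + \frac{1}{k} + \frac{2}{k} = \frac{5}{k}, \tag{18}
\end{align*}

which concludes the proof. □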

2.3. Regular conditional probabilities. The aim of this subsection is to show that limit
permutations are (essentially) in one-to-one correspondence with probability measures on
the unit square with $U[0,1]$ marginals: $Z$ is the regular conditional distribution function
of $Y$ with respect to $X$. Heuristically we may think about $Z$ as the function satisfying
$Z(x,y) = P(Y \le y \mid X = x)$, although the condition $X = x$ has zero probability.

Denote by $\mathcal{B}[0,1]$ the $\sigma$-algebra of Borel sets of the unit interval $[0,1]$ and recall the definition
of limit permutation, given in Definition 1.3.

Lemma 2.2. Let $(X,Y)$ be a $[0,1]^2$-valued random variable satisfying $X, Y \sim U[0,1]$.

(a) Then there exists a limit permutation $Z$ such that

\[ \forall B \in \mathcal{B}[0,1],\ \forall y \in [0,1] : \quad P(X \in B,\, Y \le y) = \int_0^1 Z(x,y)\,\mathbb{1}[x \in B]\,dx. \tag{19} \]

(b) Moreover, if $\tilde{Z}$ is another limit permutation satisfying (19) then

\[ \int_0^1 \mathbb{1}\big[\exists y \in [0,1] : Z(x,y) \neq \tilde{Z}(x,y)\big]\,dx = 0. \tag{20} \]

The proof of Lemma 2.2 is a standard measure-theoretic argument that is almost identical
to that of Theorem 4 of Chapter II, Section 7 of [23] about the existence of regular conditional
distribution functions. Nevertheless we give a full proof of Lemma 2.2 in the appendix in
order to keep the paper self-contained.

Lemma 2.2 tells us how to obtain a limit permutation $Z$ from a $[0,1]^2$-valued random
variable $(X,Y)$ satisfying $X, Y \sim U[0,1]$. Conversely, with a limit permutation $Z$, we may
associate such a random variable $(X,Y)$ as follows.

Definition 2.3. Let $Z \in \mathcal{Z}$. We generate the $[0,1]^2$-valued random variable $(X,Y)$ associated
with $Z$ in the following way. We first pick $X \sim U[0,1]$, and, given $X$, the random variable $Y$
is generated according to the cdf $Z(X,\cdot)$.

It is easy to check that with this definition $Z$ is indeed the regular conditional distribution
function of $Y$ with respect to $X$ in the sense of Lemma 2.2.

The joint distribution function of $(X,Y)$ associated with $Z$ can be expressed as

\[ F(x,y) = P(X \le x,\, Y \le y) \overset{(19)}{=} \int_0^x Z(u,y)\,du. \tag{21} \]

By Definition 1.3(b) we get $F(1,y) = y$, which is equivalent to $Y \sim U[0,1]$.
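
One way to simulate the pair $(X,Y)$ associated with a limit permutation is to draw $X$ uniformly and then invert the cdf $Z(X,\cdot)$ numerically. The sketch below does this by bisection on the generalized inverse; the helper names `sample_from_cdf` and `sample_pair`, the tolerance, and the example $Z(x,y) = \mathbb{1}[y \ge x]$ (which satisfies both conditions of the definition of a limit permutation and forces $Y = X$) are our own assumptions for illustration.

```python
import random


def sample_from_cdf(cdf, u, tol=1e-12):
    """Approximate inf{y in [0,1] : cdf(y) >= u} by bisection (cdf non-decreasing, cdf(1) = 1)."""
    lo, hi = 0.0, 1.0
    if cdf(lo) >= u:
        return lo
    # Invariant: cdf(lo) < u <= cdf(hi).
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) >= u:
            hi = mid
        else:
            lo = mid
    return hi


def sample_pair(Z, rng):
    """Draw X ~ U[0,1], then Y from the cdf Z(X, .) via inverse transform sampling."""
    x = rng.random()
    u = rng.random()
    y = sample_from_cdf(lambda t: Z(x, t), u)
    return x, y
```

With `Z = lambda x, y: y` this reproduces two independent uniform coordinates, while with the indicator example the sampled point lies on the diagonal up to the bisection tolerance.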

2.4. Two simple remarks about permutation convergence. In this subsection, we prove
two simple facts about the convergence of permutation sequences that were mentioned in the
introduction. We show that every convergent sequence $(\sigma_n)_{n\in\mathbb{N}}$ such that $|\sigma_n| \not\to \infty$ must be
eventually constant, and we establish that every permutation sequence contains a convergent
subsequence.

For the first, we remind the reader that Theorem 1.6 is only stated for permutation sequences
$(\sigma_n)_{n\in\mathbb{N}}$ whose lengths tend to infinity, as we claimed that every other convergent
sequence is eventually constant. In light of this, we prove this claim prior to addressing the
main results. As before, if $\sigma \in S_n$ we write $|\sigma| = n$. Recall the notion of convergence of
permutation sequences from Definition 1.2.

Claim 2.4. Let $(\sigma_n)_{n\in\mathbb{N}}$ be a convergent permutation sequence such that $|\sigma_n| \not\to \infty$. Then
the sequence $(\sigma_n)_{n\in\mathbb{N}}$ is eventually constant, that is, there is a permutation $\sigma$ and an $n_0 \in \mathbb{N}$
such that $n \ge n_0$ implies $\sigma_n = \sigma$.

Proof. It follows from (3) that $\sum_{\tau \in S_k} t(\tau,\pi) = \mathbb{1}[k \le |\pi|]$ for any $k \in \mathbb{N}$ and any permutation
$\pi$. By Definition 1.2 we get that for any fixed $k \in \mathbb{N}$ the limit
$\lim_{n\to\infty} \sum_{\tau \in S_k} t(\tau,\sigma_n) = \lim_{n\to\infty} \mathbb{1}[k \le |\sigma_n|]$ exists and must be equal to 0 or 1.

From this and our assumption that $\liminf_{n\to\infty} |\sigma_n| < \infty$ one deduces that
$\liminf_{n\to\infty} |\sigma_n| = \limsup_{n\to\infty} |\sigma_n|$; thus there is some $m \in \mathbb{N}$ such that $|\sigma_n| = m$ if $n$ is large enough.

Now if $\tau, \pi \in S_m$ then $t(\tau,\pi) = \mathbb{1}[\tau = \pi]$, and hence $\lim_{n\to\infty} t(\tau,\sigma_n)$ must be equal to 0
or 1 for all $\tau \in S_m$. From this and $|S_m| = m! < \infty$ it is straightforward to deduce that the
sequence $(\sigma_n)_{n\in\mathbb{N}}$ is eventually constant. □
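
The two identities used in this proof, namely that the densities of all patterns of a fixed length $k$ sum to 1 when $k \le |\pi|$ and to 0 otherwise, and that a permutation of length $m$ has density 1 for itself and 0 for every other pattern of length $m$, can be verified numerically on small cases. The Python sketch below is only such a sanity check; `pattern` and `density_table` are hypothetical helper names, and the normalization by $\binom{n}{k}$ is the one assumed for the density.

```python
from collections import Counter
from itertools import combinations, permutations
from math import comb


def pattern(values):
    """Relative order of a sequence of distinct numbers, as a permutation of 1..k."""
    order = sorted(values)
    return tuple(order.index(v) + 1 for v in values)


def density_table(pi, k):
    """Map each tau in S_k to its density in pi (all zeros if k > len(pi))."""
    n = len(pi)
    table = {tau: 0.0 for tau in permutations(range(1, k + 1))}
    if k > n:
        return table
    # Every increasing k-tuple of positions induces exactly one pattern in S_k.
    counts = Counter(pattern([pi[i] for i in idx])
                     for idx in combinations(range(n), k))
    for tau, c in counts.items():
        table[tau] = c / comb(n, k)
    return table
```

Summing the values of `density_table(pi, k)` gives 1 for every `k` up to the length of `pi` and 0 beyond it.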

To conclude this section, we show that permutation sequences always contain convergent 
subsequences. 

Lemma 2.5. Every permutation sequence has a convergent subsequence. 

Proof. Let $(\sigma_n)_{n\in\mathbb{N}}$ be a permutation sequence. We shall find a convergent subsequence of
$(\sigma_n)_{n\in\mathbb{N}}$ by a standard diagonalization argument. Since $S_n$ is finite for every $n \ge 1$, the set
$S = \bigcup_{n=1}^{\infty} S_n$ of all finite permutations is countable, say $S = (\tau_m)_{m\in\mathbb{N}}$.

If $(\sigma_n)_{n\in\mathbb{N}}$ does not converge, starting with $\tau_1$, we let $(\sigma_n^1)_{n\in\mathbb{N}}$ be a subsequence of $(\sigma_n)_{n\in\mathbb{N}}$
for which the bounded real sequence $(t(\tau_1,\sigma_n^1))_{n\in\mathbb{N}}$ converges. Inductively, for $m \ge 2$, we
let $(\sigma_n^m)_{n\in\mathbb{N}}$ be a subsequence of $(\sigma_n^{m-1})_{n\in\mathbb{N}}$ such that $(t(\tau_m,\sigma_n^m))_{n\in\mathbb{N}}$ converges. It is now
easy to see that the diagonal sequence $(\sigma_n^n)_{n\in\mathbb{N}}$ is such that, for every positive integer $m$,
the sequence $(t(\tau_m,\sigma_n^n))_{n\in\mathbb{N}}$ converges. In other words, the sequence $(\sigma_n^n)_{n\in\mathbb{N}}$ is a convergent
subsequence of $(\sigma_n)_{n\in\mathbb{N}}$. □

3. Z-random permutations and subpermutation densities

In this section we define the concept of a random subpermutation $\sigma(k,\pi)$ of length $k$ of a
permutation $\pi$ and relate the subpermutation densities $t(\tau,\pi)$ to the distribution of $\sigma(k,\pi)$.
Analogously, given a limit permutation $Z$ we define the $Z$-random permutation $\sigma(k,Z)$ of
length $k$ and the subpermutation densities $t(\tau,Z)$. In order to treat permutations and limit
permutations in a unified way we assign a limit permutation $Z_\sigma$ to every permutation $\sigma$ in
such a way that their subpermutation densities are close to each other.

First we recall Definition 1.1, the definition of the density $t(\tau,\pi)$ of $\tau \in S_k$ in $\pi \in S_n$. Now
we give a probabilistic interpretation of this quantity.

Definition 3.1. Let $\pi \in S_n$. For $k \le n$, the random subpermutation $\sigma(k, \pi)$ of $\pi$ of length $k$ is the random element of $S_k$ generated in the following way. Choose an element $(X_1^*, \dots, X_k^*) \in [n]^k_<$ uniformly from the $\binom{n}{k} = |[n]^k_<|$ possibilities and let $\sigma = \sigma(k, \pi)$ be the permutation of $[k]$ satisfying

$$\forall i, j \in [k] : \quad \pi(X_i^*) < \pi(X_j^*) \iff \sigma(i) < \sigma(j).$$

In plain words: $\sigma(k, \pi)$ encodes the relative order of $(\pi(X_1^*), \dots, \pi(X_k^*))$. It is obvious from Definition 1.1 that we have

$$\forall \tau \in S_k : \quad P(\sigma(k, \pi) = \tau) = t(\tau, \pi). \tag{22}$$
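
To make (22) concrete, the following Python sketch computes $t(\tau, \pi)$ by enumerating all $\binom{n}{k}$ increasing index tuples and recording the relative order of the selected values. It is only an illustration: the function names `pattern` and `density` are ours, and permutations are assumed to be given as tuples in one-line notation.

```python
from fractions import Fraction
from itertools import combinations

def pattern(values):
    """Relative order of distinct values, as a permutation of 1..k in one-line notation."""
    ranks = {v: r for r, v in enumerate(sorted(values), start=1)}
    return tuple(ranks[v] for v in values)

def density(tau, pi):
    """t(tau, pi): fraction of increasing k-tuples of positions of pi whose pattern is tau."""
    k, n = len(tau), len(pi)
    if k > n:
        return Fraction(0)
    hits = total = 0
    for idx in combinations(range(n), k):  # increasing tuples X_1^* < ... < X_k^*
        total += 1
        hits += pattern([pi[i] for i in idx]) == tuple(tau)
    return Fraction(hits, total)
```

For instance, `density((1, 2), (1, 3, 2))` counts the increasing pairs among the three pairs of positions of $(1,3,2)$. The enumeration is exponential in $k$, so this is only practical for short patterns.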

Definition 3.2 (Z-random permutation (Version 2)). Given a limit permutation $Z$ and a positive integer $n$, a $Z$-random permutation $\sigma(n, Z)$ is a permutation of $[n]$ generated as follows. Recall Definition 2.3 and let $(X_1, Y_1), \dots, (X_n, Y_n)$ be independent and identically distributed $[0,1]^2$-valued random variables with distribution associated with $Z$. These pairs define the permutations $R, S \in S_n$, where

$$R(i) = |\{j : X_j \le X_i\}|, \qquad S(i) = |\{j : Y_j \le Y_i\}|. \tag{23}$$

The random permutation $\sigma = \sigma(n, Z)$ is given by $\sigma(n, Z) = S \circ R^{-1}$; that is, $\sigma(i) = S(R^{-1}(i))$ for every $i \in [n]$.
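
The construction $\sigma = S \circ R^{-1}$ of (23) can be sketched in Python as follows. The helper names are ours. As an assumption made purely for illustration, the sampling function draws independent uniform points in $[0,1]^2$, which corresponds to the limit permutation $Z(x, y) = y$; sampling from a general $Z$ would require drawing $Y_i$ from the cdf $Z(X_i, \cdot)$ instead.

```python
import random

def ranks(values):
    """R(i) = |{j : values[j] <= values[i]}| for distinct values (1-based ranks)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def permutation_from_points(points):
    """sigma = S o R^{-1}: read off the Y-ranks in increasing order of X."""
    xs, ys = zip(*points)
    R, S = ranks(xs), ranks(ys)
    sigma = [0] * len(points)
    for i in range(len(points)):
        sigma[R[i] - 1] = S[i]  # sigma(R(i)) = S(i)
    return tuple(sigma)

def uniform_points_permutation(n, rng=random):
    """Illustrative sampler: X and Y independent U[0,1], i.e. Z(x, y) = y."""
    return permutation_from_points([(rng.random(), rng.random()) for _ in range(n)])
```

Note that the result depends only on the relative order of the points, not on the order in which they are listed.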
Note that $Z$-random permutations are well-defined with probability one, because the probability that either $(X_1, \dots, X_n)$ or $(Y_1, \dots, Y_n)$ has repeated elements (i.e., $S$ or $R$ is not a well-defined permutation) is zero. This follows directly from the fact that the entries of each of $(X_1, \dots, X_n)$ and $(Y_1, \dots, Y_n)$ are independent and uniformly distributed on $[0, 1]$. It is easy to see that $\sigma(n, Z)$ in Definition 3.2 is equivalent to the one given in Definition 1.4, since the $X^*$ and $Y^*$ defined in the latter may be related to $X$ and $Y$ by $X_{R^{-1}(i)} = X_i^*$ and $Y_j = Y^*_{R(j)}$. Moreover, observe that one could alternatively define a random permutation with the same distribution as $\sigma(n, Z)$ by first generating a sequence $(X_1^*, \dots, X_n^*)$ uniformly distributed on $[0, 1]^n$ and then drawing each $Y_j$ in $[0, 1]$ independently according to the probability distribution induced by $Z(X_j^*, \cdot)$. The random permutation $\sigma(n, Z)$ is given by the order of the elements in $(Y_1, \dots, Y_n)$. This definition of $\sigma(n, Z)$ resembles Definition 3.1.

Definition 3.3. Given a limit permutation $Z$ and $\tau \in S_k$, the density of $\tau$ in $Z$ is given by

$$t(\tau, Z) = P(\sigma(k, Z) = \tau). \tag{24}$$

It will be convenient for us to associate a limit permutation $Z_\sigma$ with every permutation $\sigma$ in such a way that $t(\tau, Z_\sigma)$ of (24) is close to $t(\tau, \sigma)$ of (22).

Definition 3.4. Given a permutation $\sigma \in S_n$ define $Z_\sigma \in \mathcal{Z}$ by defining the associated $[0,1]^2$-valued random variable $(X_\sigma, Y_\sigma)$ to have joint density function

$$f_\sigma(x, y) = n \cdot \mathbb{1}[\sigma(\lceil n \cdot x \rceil) = \lceil n \cdot y \rceil]. \tag{25}$$

Note that the support of the density function $f_\sigma(x, y)$ "looks like" the matrix $A_\sigma = (\delta_{\sigma(i), j})_{i,j=1}^{n}$ of the permutation $\sigma$, the row and column sums of which are all equal to 1. From this it easily follows that we have $X_\sigma, Y_\sigma \sim U[0, 1]$; thus, by Lemma 2.2 we can define $Z_\sigma$ almost surely uniquely. Indeed,

$$Z_\sigma(x, y) = \int_0^y f_\sigma(x, v) \, dv. \tag{26}$$
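
A minimal Python sketch of (25) and (26) follows. The function names are ours, and the convention that $x = 0$ or $y = 0$ is assigned to the first cell is an assumption of the sketch (it affects only a set of measure zero). For fixed $x$ in column $i$, all the mass sits in row $j = \sigma(i)$, so the integral in (26) is a clipped linear function of $y$.

```python
from math import ceil

def cell(n, u):
    """Index ceil(n*u) in [n]; u = 0 is mapped to cell 1 (a measure-zero convention)."""
    return max(1, ceil(n * u))

def f_sigma(sigma, x, y):
    """Density (25): n on the cells where sigma(ceil(n x)) = ceil(n y), else 0."""
    n = len(sigma)
    return n if sigma[cell(n, x) - 1] == cell(n, y) else 0

def Z_sigma(sigma, x, y):
    """Z_sigma(x, y): integral over v in [0, y] of f_sigma(x, v), cf. (26)."""
    n = len(sigma)
    j = sigma[cell(n, x) - 1]  # the only row carrying mass in this column
    return min(1.0, max(0.0, n * y - (j - 1)))
```

For each fixed $x$, the map $y \mapsto$ `Z_sigma(sigma, x, y)` rises linearly from 0 to 1 on the interval $[(j-1)/n, j/n]$, which is the cdf of a uniform variable on that interval.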

Lemma 3.5. Let $\tau \in S_k$, $\sigma \in S_n$, $k \le n$. Then

$$|t(\tau, \sigma) - t(\tau, Z_\sigma)| \le \frac{1}{n} \binom{k}{2}. \tag{27}$$

Proof. If $A$, $B$ are events in a probability space then it is easy to check that

$$|P(A) - P(A \mid B)| \le 1 - P(B) = P(B^c). \tag{28}$$

We are going to apply this inequality to prove (27). To this end, let $(X_i, Y_i)$, $i = 1, \dots, k$, be i.i.d. with the same joint distribution as $(X_\sigma, Y_\sigma)$ of Definition 3.4. Then $X_i$, $i = 1, \dots, k$, are i.i.d. and uniform on $[0, 1]$.

Let $A$ be the event that the relative order of the vertical coordinates of $(X_i, Y_i)$, $i = 1, \dots, k$, is $\tau$ with respect to their ordered horizontal coordinates, i.e., with the notation of (23), let $A = \{\tau = S \circ R^{-1}\}$.

Define $\hat{X}_i := \lceil n \cdot X_i \rceil$ for each $i \in [k]$. We define the event

$$B = \{\forall\, 1 \le i < j \le k : \hat{X}_i \ne \hat{X}_j\} \tag{29}$$

and let $(\hat{X}_1^*, \hat{X}_2^*, \dots, \hat{X}_k^*)$ denote the $k$-tuple that we get by arranging $(\hat{X}_1, \hat{X}_2, \dots, \hat{X}_k)$ in increasing order. Note that $\lceil n \cdot Y_i \rceil = \sigma(\hat{X}_i)$ by (25), and that under the condition that the event $B$ occurs the random $k$-tuple $(\hat{X}_1^*, \hat{X}_2^*, \dots, \hat{X}_k^*)$ is uniformly distributed on $[n]^k_<$. Thus we have

$$P(A) = t(\tau, Z_\sigma), \qquad P(A \mid B) = t(\tau, \sigma) \qquad \text{and} \qquad P(B^c) \le \frac{1}{n} \binom{k}{2},$$

and our result follows from (28). □
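
The bound $P(B^c) \le \frac{1}{n}\binom{k}{2}$ used above is a union bound over the $\binom{k}{2}$ pairs, each of which collides with probability $1/n$. As a sanity check, the following Python sketch (function names are ours) compares it with the exact collision probability of $k$ independent uniform draws from $[n]$, using exact rational arithmetic.

```python
from fractions import Fraction
from math import comb

def prob_collision(n, k):
    """Exact P(B^c): some two of k independent uniform draws from [n] coincide."""
    p_distinct = Fraction(1)
    for i in range(k):
        p_distinct *= Fraction(n - i, n)
    return 1 - p_distinct

def union_bound(n, k):
    """The bound (1/n) * C(k, 2) used in the proof of Lemma 3.5."""
    return Fraction(comb(k, 2), n)
```

For $k = 2$ the two quantities coincide; for larger $k$ the union bound is strict, but it has the right order of magnitude when $k^2 \ll n$.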
4. Rectangular distance

In this section we introduce the notion of rectangular distance $d_\square(\sigma_1, \sigma_2)$ between two permutations and also give the analogous definition $d_\square(Z_1, Z_2)$ of rectangular distance between two limit permutations. Using the notion of the limit permutation $Z_\sigma$ assigned to a permutation $\sigma$, we make a connection between these two definitions of $d_\square$.

In Subsection 4.2 we prove that the rectangular distance between $Z$ and the random subpermutation $\sigma(k, Z)$ is small with high probability if $1 \ll k$. In other words, we show that $Z$ can be recovered from the sample $\sigma(k, Z)$ with small error. From this we derive an analogous statement which quantifies how small $d_\square(\pi, \sigma(k, \pi))$ is. This will prove to be useful in the characterization of testability in [16].



4.1. Preliminary facts about the rectangular distance. Let $I[n]$ be the set of all intervals in $[n]$, that is, the set of all subsets of the form $\{x \in [n] : a \le x < b\}$, where $a, b \in [n + 1]$ are called the endpoints of the interval. Given a permutation $\sigma$ on $[n]$, Cooper [7] defines the discrepancy of $\sigma$ as

$$D(\sigma) = \max_{S, T \in I[n]} \left| \, |\sigma(S) \cap T| - \frac{|S|\,|T|}{n} \right|. \tag{30}$$

This is used to measure the "randomness" of a permutation. Indeed, sequences with low discrepancy, i.e., for which $D(\sigma) = o(n)$, are said to be quasi-random. We use a normalized version of the same concept to introduce a distance between permutations.

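Definition 4.1. Given permutations $\sigma_1, \sigma_2 \in S_n$, the rectangular distance between $\sigma_1$ and $\sigma_2$ is given by

$$d_\square(\sigma_1, \sigma_2) = \frac{1}{n} \max_{S, T \in I[n]} \Big| \, |\sigma_1(S) \cap T| - |\sigma_2(S) \cap T| \, \Big|. \tag{31}$$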
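
For small $n$, both the discrepancy (30) and the rectangular distance (31) can be evaluated by brute force over all pairs of intervals. The Python sketch below does exactly that; the function names are ours, intervals are taken in the half-open form $\{x : a \le x < b\}$ with $a, b \in [n+1]$, and the $O(n^5)$ running time makes it suitable only for experimentation.

```python
from fractions import Fraction

def intervals(n):
    """All intervals {x in [n] : a <= x < b} with endpoints a, b in [n + 1]."""
    return [range(a, b) for a in range(1, n + 2) for b in range(a, n + 2)]

def hits(sigma, S, T):
    """Size of the intersection of sigma(S) with T, for intervals S, T given as ranges."""
    return sum(1 for x in S if sigma[x - 1] in T)

def discrepancy(sigma):
    """D(sigma) as in (30), by brute force over all pairs of intervals."""
    n = len(sigma)
    I = intervals(n)
    return max(abs(hits(sigma, S, T) - Fraction(len(S) * len(T), n)) for S in I for T in I)

def rect_distance(s1, s2):
    """Rectangular distance (31); both permutations must have the same length."""
    n = len(s1)
    I = intervals(n)
    return Fraction(max(abs(hits(s1, S, T) - hits(s2, S, T)) for S in I for T in I), n)
```

As a small example, the two permutations of $[2]$ differ on the pair $S = T = \{1\}$, so their rectangular distance is $1/2$.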



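An analogous metric $d_\square$ may be defined to measure the distance between limit permutations. Given $Z_1, Z_2 \in \mathcal{Z}$, and the associated $[0,1]^2$-valued random variables $(X_1, Y_1)$ and $(X_2, Y_2)$ (see Definition 2.3), the rectangular distance between $Z_1$ and $Z_2$ is defined by

$$\begin{aligned}
d_\square(Z_1, Z_2) &= \sup_{\substack{x_1 < x_2 \in [0,1] \\ y_1 < y_2 \in [0,1]}} \left| \int_{x_1}^{x_2} \big(Z_1(x, y_2) - Z_1(x, y_1)\big)\, dx - \int_{x_1}^{x_2} \big(Z_2(x, y_2) - Z_2(x, y_1)\big)\, dx \right| \\
&= \sup_{\substack{x_1 < x_2 \in [0,1] \\ y_1 < y_2 \in [0,1]}} \Big| P\big(X_1 \in [x_1, x_2],\, Y_1 \in [y_1, y_2]\big) - P\big(X_2 \in [x_1, x_2],\, Y_2 \in [y_1, y_2]\big) \Big|.
\end{aligned} \tag{32}$$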



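Denote by $F_1$ and $F_2$ the joint probability distribution function (see (21)) of $(X_1, Y_1)$ and $(X_2, Y_2)$, respectively. We define

$$d_\infty(Z_1, Z_2) = \|F_1 - F_2\|_\infty = \sup_{x, y \in [0,1]} \big| P(X_1 \le x,\, Y_1 \le y) - P(X_2 \le x,\, Y_2 \le y) \big|. \tag{33}$$

Using the inclusion-exclusion identity $P(X \in [x_1, x_2],\, Y \in [y_1, y_2]) = F(x_2, y_2) - F(x_1, y_2) - F(x_2, y_1) + F(x_1, y_1)$, which holds because the marginals are continuous, it is easy to deduce that

$$d_\infty(Z_1, Z_2) \le d_\square(Z_1, Z_2) \le 4 \cdot d_\infty(Z_1, Z_2). \tag{34}$$

In the sequel we will use the simpler $d_\infty$ in our proofs rather than $d_\square$. Note that we have

$$d_\square(Z_1, Z_2) = 0 \iff d_\infty(Z_1, Z_2) = 0 \iff F_1 = F_2 \iff \int_0^1 \mathbb{1}[Z_1(x, \cdot) = Z_2(x, \cdot)]\, dx = 1. \tag{35}$$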
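
The two-sided comparison (34) can be checked numerically for the step-function limit permutations $Z_\sigma$ of Definition 3.4. In the Python sketch below (all names are ours), both suprema are restricted to grid points $i/n$, $j/n$; this restriction is a simplification made for the sketch, and the same inclusion-exclusion argument shows that the grid versions also satisfy (34).

```python
from fractions import Fraction
from itertools import permutations

def grid_cdf(sigma):
    """F[i][j] = (1/n) |{a <= i : sigma(a) <= j}|, the value of F_sigma at (i/n, j/n)."""
    n = len(sigma)
    F = [[Fraction(0)] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            point = Fraction(int(sigma[i - 1] == j), n)  # mass of cell (i, j)
            F[i][j] = F[i - 1][j] + F[i][j - 1] - F[i - 1][j - 1] + point
    return F

def d_inf_grid(s1, s2):
    """max |F_1 - F_2| over grid points, a grid version of (33)."""
    n = len(s1)
    F1, F2 = grid_cdf(s1), grid_cdf(s2)
    return max(abs(F1[i][j] - F2[i][j]) for i in range(n + 1) for j in range(n + 1))

def d_box_grid(s1, s2):
    """max over grid rectangles of the difference of rectangle masses, cf. (32)."""
    n = len(s1)
    F1, F2 = grid_cdf(s1), grid_cdf(s2)
    G = [[F1[i][j] - F2[i][j] for j in range(n + 1)] for i in range(n + 1)]
    return max(
        abs(G[i2][j2] - G[i1][j2] - G[i2][j1] + G[i1][j1])
        for i1 in range(n + 1) for i2 in range(i1, n + 1)
        for j1 in range(n + 1) for j2 in range(j1, n + 1)
    )

def check_34(n):
    """Verify d_inf <= d_box <= 4 d_inf for all pairs of permutations of [n]."""
    perms = list(permutations(range(1, n + 1)))
    return all(d_inf_grid(s, t) <= d_box_grid(s, t) <= 4 * d_inf_grid(s, t)
               for s in perms for t in perms)
```

Calling `check_34(n)` for small $n$ exhaustively confirms the grid form of (34); the cost grows like $(n!)^2 n^4$, so it is only meant for $n \le 4$ or so.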

Recalling Definition 3.4 it is not hard to see that, for permutations $\sigma_1, \sigma_2 \in S_n$, we have $d_\square(\sigma_1, \sigma_2) = d_\square(Z_{\sigma_1}, Z_{\sigma_2})$, which allows us to extend the definition of rectangular distance to permutations on different sets of integers. Indeed, we may define $d_\square(\sigma, \pi) := d_\square(Z_\sigma, Z_\pi)$ for every pair of permutations $\sigma, \pi \in S$.

Similarly, we define the rectangular distance of a permutation $\sigma$ and a limit permutation $Z$ by

$$d_\square(\sigma, Z) := d_\square(Z_\sigma, Z). \tag{36}$$

With this definition we may express the discrepancy in (30) as $D(\sigma) = n \cdot d_\square(\sigma, Z_u)$, where $Z_u$ is the uniform limit permutation defined earlier.

4.2. Rectangular distance and subpermutations. The main objective of this subsection is to show that the rectangular distance between a limit permutation $Z$ and a large constant-size random subpermutation $\sigma(k, Z)$ of it is small.

Lemma 4.2. If $k \in \mathbb{N}$ is a sufficiently large integer and $Z \in \mathcal{Z}$, then

$$P\Big( d_\square(Z, \sigma(k, Z)) \le 16 k^{-1/4} \Big) \ge 1 - \tfrac{1}{2} e^{-\sqrt{k}}. \tag{37}$$

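Proof. As in (21), denote by $F$ the joint probability distribution function of the random variable $(X, Y)$ associated with the limit permutation $Z$ (see Definition 2.3). Let $\sigma_k = \sigma(k, Z)$ and, recalling (25), we define the random functions $f_k$ and $F_k$ by

$$f_k(x, y) = k \cdot \mathbb{1}[\sigma_k(\lceil k \cdot x \rceil) = \lceil k \cdot y \rceil], \qquad F_k(x, y) = \int_0^x \int_0^y f_k(u, v)\, dv\, du. \tag{38}$$

By (36), (33) and (34), we only need to prove that if $k$ is large enough then

$$P\Big( \sup_{x, y \in [0,1]} |F(x, y) - F_k(x, y)| > 4 k^{-1/4} \Big) \le \tfrac{1}{2} e^{-\sqrt{k}}. \tag{39}$$

Both $F$ and $F_k$ have uniform marginals, so each of them changes by at most $2/k$ when $(x, y)$ is moved to the grid point $(i/k, j/k)$ with $i = \lceil k \cdot x \rceil$ and $j = \lceil k \cdot y \rceil$. Hence we have $|F(x, y) - F_k(x, y)| \le |F(i/k, j/k) - F_k(i/k, j/k)| + 4/k$. We claim that, in order to show (39), we only need to show that for large $k$ we have

$$P\Big( \max_{i, j \in [k]} \big| F(i/k, j/k) - F_k(i/k, j/k) \big| > 3 k^{-1/4} \Big) \le 6 k^2 e^{-2\sqrt{k}}. \tag{40}$$

This is because $4/k < k^{-1/4}$ for large $k$ and

$$P\Big( \sup_{x, y \in [0,1]} |F(x, y) - F_k(x, y)| > 4 k^{-1/4} \Big) \le P\Big( \max_{i, j \in [k]} \big| F(i/k, j/k) - F_k(i/k, j/k) \big| > 3 k^{-1/4} \Big) \le 6 k^2 e^{-2\sqrt{k}} = \tfrac{1}{2} e^{-\sqrt{k}} \cdot \big( 12 k^2 e^{-\sqrt{k}} \big).$$

Now $12 k^2 e^{-\sqrt{k}} < 1$ if $k$ is large enough, and we get (39).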

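To establish (40), let $(X_1, Y_1), \dots, (X_k, Y_k)$ be i.i.d. $[0,1]^2$-valued random variables with distribution function $F(x, y)$. Denote by $(X_1^*, \dots, X_k^*)$ and $(Y_1^*, \dots, Y_k^*)$ the values of $\{X_1, \dots, X_k\}$ and $\{Y_1, \dots, Y_k\}$, respectively, rearranged in increasing order.

Then we have

$$F_k\Big(\frac{i}{k}, \frac{j}{k}\Big) = \frac{1}{k} \sum_{l=1}^{i} \mathbb{1}[\sigma_k(l) \le j] = \frac{1}{k} \sum_{l=1}^{k} \mathbb{1}[X_l \le X_i^*,\, Y_l \le Y_j^*].$$

In order to prove (40) we first show that

$$\forall i, j \in [k] : \quad P\Big( \frac{1}{k} \sum_{l=1}^{k} \mathbb{1}[X_l \le X_i^*,\, Y_l \le Y_j^*] > F\Big(\frac{i}{k}, \frac{j}{k}\Big) + 3 k^{-1/4} \Big) \le 3 e^{-2\sqrt{k}}. \tag{41}$$

Set $\varepsilon = k^{-1/4}$. Since $F$ has uniform marginals we get

$$F\Big(\frac{i}{k} + \varepsilon, \frac{j}{k} + \varepsilon\Big) \le F\Big(\frac{i}{k}, \frac{j}{k}\Big) + 2\varepsilon. \tag{42}$$

Let us define the event

$$A := \Big\{ \frac{1}{k} \sum_{l=1}^{k} \mathbb{1}[X_l \le X_i^*,\, Y_l \le Y_j^*] > F\Big(\frac{i}{k} + \varepsilon, \frac{j}{k} + \varepsilon\Big) + \varepsilon \Big\}.$$

Then by (42) the probability on the left-hand side of (41) is less than or equal to $P(A)$. On the other hand,

$$\begin{aligned}
P(A) &= P\Big( A \cap \big\{X_i^* \le \tfrac{i}{k} + \varepsilon\big\} \cap \big\{Y_j^* \le \tfrac{j}{k} + \varepsilon\big\} \Big) + P\Big( A \cap \Big( \big\{X_i^* > \tfrac{i}{k} + \varepsilon\big\} \cup \big\{Y_j^* > \tfrac{j}{k} + \varepsilon\big\} \Big) \Big) \\
&\le P\Big( A \cap \big\{X_i^* \le \tfrac{i}{k} + \varepsilon\big\} \cap \big\{Y_j^* \le \tfrac{j}{k} + \varepsilon\big\} \Big) + P\Big( X_i^* > \tfrac{i}{k} + \varepsilon \Big) + P\Big( Y_j^* > \tfrac{j}{k} + \varepsilon \Big).
\end{aligned} \tag{43}$$

In order to bound these probabilities we are going to use the following large deviation estimate on binomial random variables, which is a consequence of Hoeffding's inequality (see McDiarmid [22]).

$$S \sim \mathrm{Bin}(k, p) \implies \begin{cases} P\big( \tfrac{S}{k} \ge p + \varepsilon \big) \le \exp(-2 k \varepsilon^2), \\ P\big( \tfrac{S}{k} \le p - \varepsilon \big) \le \exp(-2 k \varepsilon^2). \end{cases} \tag{44}$$

This inequality will be used to bound the three terms on the r.h.s. of (43). Since $X_i^* > \frac{i}{k} + \varepsilon$ means that fewer than $i$ of the $X_l$ are at most $\frac{i}{k} + \varepsilon$, we have

$$P\Big( X_i^* > \frac{i}{k} + \varepsilon \Big) = P\Big( \frac{1}{k} \sum_{l=1}^{k} \mathbb{1}\big[X_l \le \tfrac{i}{k} + \varepsilon\big] < \frac{i}{k} \Big) \le \exp(-2 k \varepsilon^2),$$

and, similarly, it holds that $P\big( Y_j^* > \frac{j}{k} + \varepsilon \big) \le \exp(-2 k \varepsilon^2)$. Thus it only remains to notice that

$$P\Big( A \cap \big\{X_i^* \le \tfrac{i}{k} + \varepsilon\big\} \cap \big\{Y_j^* \le \tfrac{j}{k} + \varepsilon\big\} \Big) \le P\Big( \frac{1}{k} \sum_{l=1}^{k} \mathbb{1}\big[X_l \le \tfrac{i}{k} + \varepsilon,\, Y_l \le \tfrac{j}{k} + \varepsilon\big] > F\Big(\frac{i}{k} + \varepsilon, \frac{j}{k} + \varepsilon\Big) + \varepsilon \Big) \le \exp(-2 k \varepsilon^2).$$

Thus by (43) and the above applications of (44) we get (41), namely

$$P(A) \le 3 \exp(-2 k \varepsilon^2) = 3 \exp\big(-2 k (k^{-1/4})^2\big) = 3 e^{-2\sqrt{k}}.$$

The inequality

$$\forall i, j \in [k] : \quad P\Big( \frac{1}{k} \sum_{l=1}^{k} \mathbb{1}[X_l \le X_i^*,\, Y_l \le Y_j^*] < F\Big(\frac{i}{k}, \frac{j}{k}\Big) - 3 k^{-1/4} \Big) \le 3 e^{-2\sqrt{k}}$$

can be proven analogously. Putting this and (41) together, and taking a union bound over the $k^2$ pairs $(i, j)$, we get (40). □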
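
The Hoeffding-type estimate (44) can be checked numerically against exact binomial tails. The following Python sketch (function names are ours) computes the exact upper tail for $S \sim \mathrm{Bin}(k, p)$ and the bound $\exp(-2k\varepsilon^2)$; with $\varepsilon = k^{-1/4}$, as in the proof, the bound equals $e^{-2\sqrt{k}}$.

```python
from math import comb, exp

def upper_tail(k, p, eps):
    """Exact P(S/k >= p + eps) for S ~ Bin(k, p)."""
    return sum(comb(k, s) * p**s * (1 - p)**(k - s)
               for s in range(k + 1) if s / k >= p + eps)

def hoeffding_bound(k, eps):
    """Right-hand side exp(-2 k eps^2) of (44)."""
    return exp(-2 * k * eps**2)
```

For moderate $k$ the exact tail is typically much smaller than the bound, which is what leaves room for the union bound over $k^2$ grid points in (40).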
Lemma 4.3. Let $k$ be a sufficiently large positive integer. Let $n > e^{2\sqrt{k}}$ and $\pi \in S_n$. Then we have

$$P\Big( d_\square(\pi, \sigma(k, \pi)) \le 16 k^{-1/4} \Big) \ge 1 - e^{-\sqrt{k}}. \tag{45}$$

Proof. The proof uses ideas similar to the ones in the proof of Lemma 3.5.

Given the permutation $\pi$, define $Z_\pi$ by Definition 3.4. The random permutations $\sigma(k, \pi)$ and $\sigma(k, Z_\pi)$ do not have the same distribution, but, if $(X_1, Y_1), \dots, (X_k, Y_k)$ are i.i.d. $[0,1]^2$-valued random variables associated with $Z_\pi$ (see Definition 2.3), then

(i) the distribution of $\sigma(k, Z_\pi)$ is the same as that of $S \circ R^{-1}$ (defined for $Z_\pi$ in (23));

(ii) the distribution of $\sigma(k, \pi)$ is the same as that of $S \circ R^{-1}$ under the condition that the event $B$ defined in (29) occurs.

It follows from (28) that for any event $A$ we have

$$|P(A) - P(A \mid B)| \le P(B^c) \le \frac{1}{n} \binom{k}{2} \le e^{-2\sqrt{k}} \binom{k}{2} \overset{(*)}{\le} \tfrac{1}{2} e^{-\sqrt{k}},$$

where the inequality $(*)$ holds if $k$ is large enough. If we let $A = \{ d_\square(Z_\pi, S \circ R^{-1}) \le 16 k^{-1/4} \}$ then we have $P(A) \ge 1 - \tfrac{1}{2} e^{-\sqrt{k}}$ by Lemma 4.2. Now, for $k$ large enough, (45) follows from

$$P\Big( d_\square(\pi, \sigma(k, \pi)) \le 16 k^{-1/4} \Big) = P(A \mid B) \ge P(A) - P(B^c) \ge 1 - \tfrac{1}{2} e^{-\sqrt{k}} - \tfrac{1}{2} e^{-\sqrt{k}} = 1 - e^{-\sqrt{k}}.$$

□



5. Limits of permutation sequences 

The objective of this section is to use the machinery developed in the previous sections to prove our main results. In Subsection 5.1 we define three different types of convergence on the space of limit permutations $\mathcal{Z}$ (weak convergence, $d_\square$-convergence and convergence of subpermutation densities) and prove that they are all equivalent. This is then used in Subsection 5.2 to prove the theorems stated in the introduction.

5.1. Equivalence of different notions of convergence of limit permutations. First we prove that a limit permutation is uniquely determined by the collection of its subpermutation densities (up to the almost sure equivalence in (35)). Recall that we denote by $S = \bigcup_{n=1}^{\infty} S_n$ the set of all finite permutations.

Lemma 5.1. Let $Z, \tilde{Z} \in \mathcal{Z}$. Denote by $(X, Y)$ and $(\tilde{X}, \tilde{Y})$ the $[0,1]^2$-valued random variables associated with $Z$ and $\tilde{Z}$ (see Definition 2.3). Then we have

$$\forall \tau \in S : \ t(\tau, Z) = t(\tau, \tilde{Z}) \quad \implies \quad (X, Y) \sim (\tilde{X}, \tilde{Y}).$$

Proof. Denote by $F$ the joint probability distribution function of $(X, Y)$ (see (21)). By (35) it suffices to prove that, if we know $(t(\tau, Z))_{\tau \in S}$, then we also know the value of $F(x, y)$ for every $x, y \in [0, 1]$.

As a matter of fact, given $(t(\tau, Z))_{\tau \in S}$ we know the distribution of the random permutation $\sigma(k, Z)$ for every $k \in \mathbb{N}$ by (24), and, from $\sigma(k, Z)$, we can define the random function $F_k$ as in (38). Thus we can calculate the expected value $E(F_k(x, y))$ given $(t(\tau, Z))_{\tau \in S}$. As in the proof of Lemma 4.2 it follows that (39) holds, which implies that for every $x, y \in [0, 1]$ we have $\lim_{k\to\infty} E(F_k(x, y)) = F(x, y)$.

In other words, we have shown that, given $(t(\tau, Z))_{\tau \in S}$, the value of $F(x, y)$ can be recovered for every $x, y \in [0, 1]$. □

Recall the definition of convergence in distribution $\xrightarrow{d}$ from Subsection 2.2 and the definition of $d_\square(Z_1, Z_2)$ from (32).
Definition 5.2. Let $Z, Z_1, Z_2, \dots$ be limit permutations. Denote by $(X_n, Y_n)$ and $(X, Y)$ the $[0,1]^2$-valued random variables associated with $Z_n$ and $Z$, respectively. We say that

1. $Z_n \Rightarrow Z$ if $(X_n, Y_n) \xrightarrow{d} (X, Y)$ holds;

2. $Z_n \xrightarrow{\square} Z$ if $\lim_{n\to\infty} d_\square(Z_n, Z) = 0$;

3. $Z_n \xrightarrow{t} Z$ if $\forall \tau \in S : \lim_{n\to\infty} t(\tau, Z_n) = t(\tau, Z)$.

Lemma 5.3. Let $Z, Z_1, Z_2, \dots$ be limit permutations. The three notions of convergence of Definition 5.2 are equivalent, that is, we have

$$Z_n \Rightarrow Z \iff Z_n \xrightarrow{\square} Z \iff Z_n \xrightarrow{t} Z.$$

Proof. As before, denote by $F_n$ and $F$ the respective joint probability distribution functions of $(X_n, Y_n)$ and $(X, Y)$, following (21).

We first address the equivalence of $Z_n \Rightarrow Z$ and $Z_n \xrightarrow{\square} Z$. On the one hand, $Z_n \Rightarrow Z$ and $(X_n, Y_n) \xrightarrow{d} (X, Y)$ are equivalent by definition, and it holds that $X_n, Y_n \sim U[0, 1]$. On the other hand, by (33) and (34), we have that $Z_n \xrightarrow{\square} Z$ is equivalent to $\|F_n - F\|_\infty \to 0$. The equivalence of $Z_n \Rightarrow Z$ and $Z_n \xrightarrow{\square} Z$ is now a direct consequence of Lemma 2.1.

Now we are going to prove that $Z_n \Rightarrow Z$ and $Z_n \xrightarrow{t} Z$ are equivalent. First we assume that $Z_n \Rightarrow Z$ holds. In order to deduce $Z_n \xrightarrow{t} Z$ from this, we need to show that

$$\forall \tau \in S : \quad \lim_{n\to\infty} t(\tau, Z_n) = t(\tau, Z).$$

Fix $k$ and $\tau \in S_k$. Let $(X_n^i, Y_n^i)$, $i = 1, \dots, k$, be i.i.d. with the same joint distribution as $(X_n, Y_n)$, while $(X^i, Y^i)$, $i = 1, \dots, k$, are i.i.d. with the same joint distribution as $(X, Y)$. It follows from $Z_n \Rightarrow Z$ and (13) that

$$\big( (X_n^i, Y_n^i) \big)_{i \in [k]} \xrightarrow{d} \big( (X^i, Y^i) \big)_{i \in [k]} \quad \text{as } n \to \infty, \tag{46}$$

i.e., the sequence of $[0,1]^{2k}$-valued random variables $\big( (X_n^i, Y_n^i) \big)_{i \in [k]}$, $n = 1, 2, \dots$, converges in distribution to the $[0,1]^{2k}$-valued random variable $\big( (X^i, Y^i) \big)_{i \in [k]}$.

Let $A_\tau \subseteq [0,1]^{2k}$ denote the event that the relative order of the vertical coordinates of $(x^i, y^i)$, $i = 1, \dots, k$, is $\tau$ with respect to their ordered horizontal coordinates. Then

$$P\Big( \big( (X_n^i, Y_n^i) \big)_{i \in [k]} \in A_\tau \Big) = t(\tau, Z_n) \quad \text{and} \quad P\Big( \big( (X^i, Y^i) \big)_{i \in [k]} \in A_\tau \Big) = t(\tau, Z). \tag{47}$$

In order to deduce $t(\tau, Z_n) \to t(\tau, Z)$ from (47) we apply (14). We only need to show that the boundary of $A_\tau$ has measure 0 with respect to the distribution of $\big( (X^i, Y^i) \big)_{i \in [k]}$, which is indeed true because, on the boundary of $A_\tau$, either $x^i = x^j$ or $y^i = y^j$ holds for some $i \ne j$, but this has zero probability since both $(X^1, \dots, X^k)$ and $(Y^1, \dots, Y^k)$ are i.i.d. and $U[0, 1]$-distributed. Thus we have proven that $Z_n \Rightarrow Z$ implies $Z_n \xrightarrow{t} Z$.

Now we assume that $Z_n \xrightarrow{t} Z$ holds and deduce from this that $Z_n \Rightarrow Z$ holds. By the definition of convergence in distribution we only need to show that for any bounded, continuous function $f : [0,1]^2 \to \mathbb{R}$ we have

$$E\big( f(X_n, Y_n) \big) \to E\big( f(X, Y) \big). \tag{48}$$

We are going to prove this by contradiction: assume that there is some $f$ such that (48) does not hold. Since $|E( f(X_n, Y_n) )| \le \|f\|_\infty$, we can choose a subsequence $(n(m))_{m=1}^{\infty}$ such that $\lim_{m\to\infty} E\big( f(X_{n(m)}, Y_{n(m)}) \big) = \alpha$ for some $\alpha \ne E( f(X, Y) )$. Now by (10) we can find a second subsequence $(m(\ell))_{\ell=1}^{\infty}$ and a $[0,1]^2$-valued random variable $(\tilde{X}, \tilde{Y})$ such that if we define $(\tilde{X}_\ell, \tilde{Y}_\ell) := (X_{n(m(\ell))}, Y_{n(m(\ell))})$ then $(\tilde{X}_\ell, \tilde{Y}_\ell) \xrightarrow{d} (\tilde{X}, \tilde{Y})$ as $\ell \to \infty$. In particular, we have $E( f(\tilde{X}, \tilde{Y}) ) = \alpha$.

We now look at the properties of $\tilde{X}$ and $\tilde{Y}$. By (12) we know that $\tilde{X}, \tilde{Y} \sim U[0, 1]$. Using Lemma 2.2 we define $\tilde{Z}$ to be the limit permutation associated with $(\tilde{X}, \tilde{Y})$. By definition, we now have $\tilde{Z}_\ell \Rightarrow \tilde{Z}$ where $\tilde{Z}_\ell := Z_{n(m(\ell))}$, and hence $\tilde{Z}_\ell \xrightarrow{t} \tilde{Z}$ follows from our previous argument. On the other hand, from the assumption $Z_n \xrightarrow{t} Z$, we obtain $\tilde{Z}_\ell \xrightarrow{t} Z$.

However, from $\tilde{Z}_\ell \xrightarrow{t} \tilde{Z}$, $\tilde{Z}_\ell \xrightarrow{t} Z$ and Lemma 5.1 we obtain $(X, Y) \sim (\tilde{X}, \tilde{Y})$ and, in particular, $E( f(\tilde{X}, \tilde{Y}) ) = E( f(X, Y) )$, which contradicts $E( f(\tilde{X}, \tilde{Y}) ) = \alpha \ne E( f(X, Y) )$.

Thus we have proven that $Z_n \Rightarrow Z$ and $Z_n \xrightarrow{t} Z$ are equivalent, which completes the proof of the lemma. □

5.2. Proof of the main theorems. In this section, we shall focus on the proofs of the main results of this work, which have been stated in the introduction. We start with a brief overview of these results. Theorem 1.6 establishes the correspondence between convergent permutation sequences and limit permutations, in the sense that there is a limit permutation associated with every convergent sequence and vice versa. The fact that the limit of a permutation sequence is essentially unique is the subject of Theorem 1.7, while Theorem 1.8 relates the original concept of convergence of a permutation sequence to convergence with respect to the metric $d_\square$.

As a consequence of Claim 2.4 we shall henceforth restrict our attention to permutation sequences $(\sigma_n)_{n\in\mathbb{N}}$ such that $|\sigma_n| \to \infty$. Moreover, as a useful auxiliary fact, we point out that it follows from Definitions 1.5 and 3.4 and from Lemma 3.5 that we have

$$\sigma_n \to Z \iff Z_{\sigma_n} \xrightarrow{t} Z, \tag{49}$$

where $\xrightarrow{t}$ denotes convergence of subpermutation densities as in Definition 5.2.

Proof of Theorem 1.6 (i). Let us define $Z_n := Z_{\sigma_n}$. As usual, we denote by $(X_n, Y_n)$ the $[0,1]^2$-valued random variables associated with $Z_n$ (see Definition 2.3), which satisfy $X_n, Y_n \sim U[0, 1]$ for every $n$. By (10) there is a $[0,1]^2$-valued random variable $(X, Y)$ and a subsequence $(n(m))_{m=1}^{\infty}$ such that $(X_{n(m)}, Y_{n(m)}) \xrightarrow{d} (X, Y)$ as $m \to \infty$. Moreover, it follows from (12) that we have $X, Y \sim U[0, 1]$. Using Lemma 2.2 we define $Z$ to be the limit permutation associated with $(X, Y)$. With these definitions we have $Z_{n(m)} \Rightarrow Z$ as $m \to \infty$, from which convergence of the subpermutation densities, $t(\tau, Z_{n(m)}) \to t(\tau, Z)$ for every $\tau$, follows by Lemma 5.3. By (49) we obtain $t(\tau, \sigma_{n(m)}) \to t(\tau, Z)$ as $m \to \infty$ for every $\tau$. Since we have assumed that $t(\tau, \sigma_n)$ is convergent (see Definition 1.2) as $n \to \infty$ we obtain $t(\tau, \sigma_n) \to t(\tau, Z)$ as $n \to \infty$, i.e., $\sigma_n \to Z$. □

Proof of Theorem 1.6 (ii). The basic idea of the proof comes from [9], where the authors suggest that the analogous theorem for graphs and graphons can be proven using reverse martingales.

We jointly define the random permutations $(\sigma_n)_{n\in\mathbb{N}}$ in the following way: let $(X_n, Y_n)$, $n \in \mathbb{N}$, be i.i.d. $[0,1]^2$-valued random variables associated with $Z$ (see Definition 2.3). Let $\sigma_n$ be the relative order of the vertical coordinates of $(X_i, Y_i)$, $i = 1, \dots, n$, with respect to their ordered horizontal coordinates (see Definition 3.2). Note that, with this definition, $\sigma_n$ is almost surely a subpermutation of $\sigma_{n+1}$.

We are going to show that $P(\sigma_n \to Z) = 1$, i.e., that the random sequence of permutations $(\sigma_n)_{n\in\mathbb{N}}$ almost surely converges to $Z$. It is enough to show that, for all $\tau \in S$, we have $P\big( t(\tau, \sigma_n) \to t(\tau, Z) \big) = 1$. Fix $\tau \in S_k$ and let $A_\tau \subseteq [0,1]^{2k}$ denote the event that the relative order of the vertical coordinates of $(x^i, y^i)$, $i = 1, \dots, k$, is $\tau$ with respect to their ordered horizontal coordinates. If $k \le n$ then by (3) the random variable $t(\tau, \sigma_n)$ can be expressed as

$$t(\tau, \sigma_n) = \binom{n}{k}^{-1} \sum_{(i_1, \dots, i_k) \in [n]^k_<} \mathbb{1}\Big[ \big( (X_{i_j}, Y_{i_j}) \big)_{j \in [k]} \in A_\tau \Big].$$

From this formula and (47), linearity of expectation leads to $E\big( t(\tau, \sigma_n) \big) = t(\tau, Z)$.

Now $P\big( t(\tau, \sigma_n) \to t(\tau, Z) \big) = 1$ follows from the strong law of large numbers for U-statistics (see Section 3.4 of [18]). □



We observe that there is an alternative, more self-contained proof of $P(\sigma_n \to Z) = 1$, with $(\sigma_n)$ as in the proof above. In fact, we may use the Borel-Cantelli lemma to derive $P\big( d_\square(\sigma_n, Z) \to 0 \big) = 1$ from Lemma 4.2, then apply Lemma 5.3 to derive from this that, almost surely, $t(\tau, Z_{\sigma_n}) \to t(\tau, Z)$ for every $\tau \in S$, and finally use (49) to arrive at $P(\sigma_n \to Z) = 1$.

Proof of Theorem 1.7. We want to show that $\sigma_n \to Z_1$ and $\sigma_n \to Z_2$ implies that the set $\{x : Z_1(x, \cdot) \ne Z_2(x, \cdot)\}$ has Lebesgue measure zero. In light of the equivalence of the statements in (35), this follows from Lemma 5.1. □

Proof of Theorem 1.8. By Claim 2.4 we may assume that $|\sigma_n| \to \infty$.

Let $Z_n := Z_{\sigma_n}$. It suffices to show that $(\sigma_n)_{n\in\mathbb{N}}$ is a Cauchy sequence w.r.t. $d_\square$ if and only if $d_\square(Z_n, Z) \to 0$ for some $Z \in \mathcal{Z}$ (i.e., the completion of $(S, d_\square)$ is $(\mathcal{Z}/\!\sim, d_\square)$, where $\sim$ is the equivalence relation in (35)). Indeed, assuming that this is true, Lemma 5.3 implies that the fact that $(\sigma_n)_{n\in\mathbb{N}}$ is a Cauchy sequence with respect to $d_\square$ is equivalent to the convergence $t(\tau, Z_n) \to t(\tau, Z)$ for every $\tau \in S$ and some limit permutation $Z$, which is equivalent to $\sigma_n \to Z$ by (49), as required.

We now prove the above assertion. It is clear that, if $d_\square(Z_n, Z) \to 0$ for some $Z \in \mathcal{Z}$, then $(\sigma_n)$ is a Cauchy sequence w.r.t. $d_\square$. For the converse, let $(\sigma_n)$ be such a Cauchy sequence and denote by $(X_n, Y_n)$ the $[0,1]^2$-valued random variables associated with $Z_n$. Denote by $F_n$ the joint probability distribution function of $(X_n, Y_n)$. By (33) and (34) we get that $(F_n)$ is a Cauchy sequence w.r.t. the $L_\infty$-norm. Thus by the completeness of $L_\infty[0,1]^2$, we have $\|F_n - \tilde{F}\|_\infty \to 0$ for some $\tilde{F} : [0,1]^2 \to [0, 1]$. Now we show that $\tilde{F}$ itself is the joint probability distribution function of a $[0,1]^2$-valued random variable with uniform marginals.

By (10) there is an $(X, Y)$ and a subsequence $(n(m))_{m=1}^{\infty}$ such that $(X_{n(m)}, Y_{n(m)}) \xrightarrow{d} (X, Y)$ as $m \to \infty$. By (12) we have $X, Y \sim U[0, 1]$. Denote by $F$ the joint probability distribution function of $(X, Y)$. By Lemma 2.1 we have $\|F_{n(m)} - F\|_\infty \to 0$. Comparing this to $\|F_n - \tilde{F}\|_\infty \to 0$ we get that $\tilde{F} = F$. In other words, $(Z_n)$ converges to some $Z \in \mathcal{Z}$ w.r.t. the $L_\infty$-norm, and hence $d_\square(Z_n, Z) \to 0$ by (35), as required. □

Acknowledgements: The authors are indebted to an anonymous referee for valuable comments 
and suggestions. 
Appendix A. Proof of Lemma 2.2

In this appendix we prove Lemma 2.2. The proof is quite similar to that of Theorem 4 of Chapter II, Section 7 of [23]. It relies on Kolmogorov's notion of conditional expectation, about which we recall some useful facts now.

Given a measurable space $(\Omega, \mathcal{F})$ and an event $A \in \mathcal{F}$ we denote by $\mathbb{1}_A = \mathbb{1}[A]$ the random variable which takes value 1 if the event $A$ occurs and 0 if $A$ does not occur. The following theorem states the existence and almost sure uniqueness of conditional expectation. For a proof, see Section 7 of Chapter 2 of [23].

Theorem A.1. Let $(\Omega, \mathcal{F}, P)$ be a probability space and let $\mathcal{G} \subseteq \mathcal{F}$ be a $\sigma$-algebra. Consider an $\mathcal{F}$-measurable, real-valued random variable $Y$ with $E(|Y|) < \infty$.

(i) There exists a $\mathcal{G}$-measurable, real-valued random variable $W$ such that $E(|W|) < \infty$ and

$$\forall A \in \mathcal{G} : \quad E(Y \cdot \mathbb{1}_A) = E(W \cdot \mathbb{1}_A). \tag{50}$$

(ii) If $\tilde{W}$ is another $\mathcal{G}$-measurable, real-valued random variable such that $E(|\tilde{W}|) < \infty$ and

$$\forall A \in \mathcal{G} : \quad E(Y \cdot \mathbb{1}_A) = E(\tilde{W} \cdot \mathbb{1}_A),$$

then we have $P(W = \tilde{W}) = 1$.

We call any $W$ that satisfies Theorem A.1(i) the conditional expectation of $Y$ with respect to $\mathcal{G}$ and let $E(Y \mid \mathcal{G}) = W$. Note that Theorem A.1(ii) states that the conditional expectation of $Y$ w.r.t. $\mathcal{G}$ is almost surely uniquely defined.
If $X$ is another real-valued $\mathcal{F}$-measurable random variable and $\sigma(X)$ is the $\sigma$-algebra generated by $X$ then we let $E(Y \mid X) = E(Y \mid \sigma(X))$. Since $E(Y \mid X)$ is a $\sigma(X)$-measurable random variable, it follows from Theorem 3 of Section 4 of Chapter 2 of [23] that there is a Borel-measurable function $\varphi : \mathbb{R} \to \mathbb{R}$ such that

$$E(Y \mid X) = \varphi(X). \tag{51}$$

For $A \in \mathcal{F}$, let $P(A \mid X) := E(\mathbb{1}_A \mid X)$.

The proofs of the following facts about conditional expectation can be found in Section 7 of Chapter 2 of [23]:

$$Y_1 \le Y_2 \implies P\big( E(Y_1 \mid \mathcal{G}) \le E(Y_2 \mid \mathcal{G}) \big) = 1, \tag{52}$$

$$Y_n \le Y^*, \quad E(Y^*) < \infty, \quad Y_n \searrow Y \implies P\big( E(Y_n \mid \mathcal{G}) \searrow E(Y \mid \mathcal{G}) \big) = 1, \tag{53}$$

where the notation $Z_n \searrow Z$ indicates that $(Z_n)$ is nonincreasing and converges to $Z$. Recall that we write $\mathcal{B}[0, 1]$ for the $\sigma$-algebra of Borel sets of the unit interval $[0, 1]$. Denote by $\lambda$ the Lebesgue measure on $[0, 1]$. We are now ready to give the proof of Lemma 2.2.

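Proof of Lemma 2.2. Let $(\Omega, \mathcal{F}, P)$ be the probability space on which our $[0,1]^2$-valued random variable $(X, Y)$ lives. Recall that we assume $X, Y \sim U[0, 1]$ and, therefore, for any $B \in \mathcal{B}[0, 1]$, we have $P(X \in B) = P(Y \in B) = \lambda(B)$. Denote by $\tilde{\mathbb{Q}} = \mathbb{Q} \cap [0, 1]$ the set of rational numbers in $[0, 1]$. Thus, $\tilde{\mathbb{Q}}$ is a countable, dense subset of $[0, 1]$. Throughout the proof, $\tilde{y}$, $\tilde{y}_1$ and $\tilde{y}_2$ denote elements of $\tilde{\mathbb{Q}}$, while $x$ and $y$ denote elements of $[0, 1]$.

For $x \in [0, 1]$ and $\tilde{y} \in \tilde{\mathbb{Q}}$ define $\tilde{Z}(x, \tilde{y})$ to be the function for which

$$\tilde{Z}(X, \tilde{y}) \overset{(51)}{=} E\big( \mathbb{1}[Y \le \tilde{y}] \mid X \big) = P(Y \le \tilde{y} \mid X). \tag{54}$$

Note that, given $\tilde{y}$, the function $\tilde{Z}(\cdot, \tilde{y})$ is only defined almost surely uniquely with respect to the distribution of $X$ (i.e., the Lebesgue measure $\lambda$). We may assume $\tilde{Z}(x, 1) = 1$.

For $\tilde{y} \in \tilde{\mathbb{Q}}$ and $\tilde{y}_1 \le \tilde{y}_2 \in \tilde{\mathbb{Q}}$, define $A_{\tilde{y}_1, \tilde{y}_2} \in \mathcal{B}[0, 1]$ and $C_{\tilde{y}} \in \mathcal{B}[0, 1]$ by

$$A_{\tilde{y}_1, \tilde{y}_2} = \big\{ x : \tilde{Z}(x, \tilde{y}_1) \le \tilde{Z}(x, \tilde{y}_2) \big\}, \qquad C_{\tilde{y}} = \Big\{ x : \lim_{n\to\infty} \tilde{Z}\big(x, \tilde{y} + \tfrac{1}{n}\big) = \tilde{Z}(x, \tilde{y}) \Big\}. \tag{55}$$

It follows from $\mathbb{1}[Y \le \tilde{y}_1] \le \mathbb{1}[Y \le \tilde{y}_2]$ and (52) that $P\big( \tilde{Z}(X, \tilde{y}_1) \le \tilde{Z}(X, \tilde{y}_2) \big) = 1$, which implies $\lambda(A_{\tilde{y}_1, \tilde{y}_2}) = 1$. Moreover, it follows from $\mathbb{1}[Y \le \tilde{y} + \tfrac{1}{n}] \searrow \mathbb{1}[Y \le \tilde{y}]$ and (53) that $P\big( \tilde{Z}(X, \tilde{y} + \tfrac{1}{n}) \searrow \tilde{Z}(X, \tilde{y}) \big) = 1$, which implies $\lambda(C_{\tilde{y}}) = 1$. Define $A \in \mathcal{B}[0, 1]$ by

$$A := \Big( \bigcap_{\tilde{y}_1 \le \tilde{y}_2 \in \tilde{\mathbb{Q}}} A_{\tilde{y}_1, \tilde{y}_2} \Big) \cap \Big( \bigcap_{\tilde{y} \in \tilde{\mathbb{Q}}} C_{\tilde{y}} \Big).$$

If $x \in A$ then $\tilde{Z}(x, \cdot)|_{\tilde{\mathbb{Q}}}$ is a monotone increasing, right continuous function. By the countable additivity of the measure $\lambda$ we have $\lambda(A) = 1$.

Now we construct $Z : [0,1]^2 \to [0, 1]$ by defining $Z(x, 1) = 1$ for all $x$ and, for $y \in [0, 1)$, letting

$$Z(x, y) := \begin{cases} \inf_{y < \tilde{y}} \tilde{Z}(x, \tilde{y}) & \text{if } x \in A, \\ y & \text{if } x \notin A. \end{cases} \tag{56}$$

Now we show that this $Z$ is a limit permutation (see Definition 1.3) and that it satisfies Lemma 2.2(a). Note first that it follows from (54) and (56) that $Z$ is measurable.

It follows directly from (56) that, for all $x \in [0, 1]$, the function $Z(x, \cdot)$ is a cdf, and thus $Z$ satisfies Definition 1.3(a). For all $x \in A$ and $\tilde{y} \in \tilde{\mathbb{Q}}$ we have $Z(x, \tilde{y}) = \tilde{Z}(x, \tilde{y})$.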
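Next we show that for any $B \in \mathcal{B}[0, 1]$ and $y \in [0, 1]$ we have (19):

$$\begin{aligned}
\int_0^1 Z(x, y)\, \mathbb{1}[x \in B]\, dx
&\overset{\lambda(A) = 1}{=} \int_0^1 Z(x, y)\, \mathbb{1}[x \in B \cap A]\, dx \\
&\overset{(56)}{=} \int_0^1 \inf_{y < \tilde{y}} \tilde{Z}(x, \tilde{y})\, \mathbb{1}[x \in B \cap A]\, dx
\overset{(*)}{=} \inf_{y < \tilde{y}} \int_0^1 \tilde{Z}(x, \tilde{y})\, \mathbb{1}[x \in B \cap A]\, dx \\
&= \inf_{y < \tilde{y}} E\big( \tilde{Z}(X, \tilde{y})\, \mathbb{1}[X \in B \cap A] \big)
\overset{\lambda(A) = 1}{=} \inf_{y < \tilde{y}} E\big( \tilde{Z}(X, \tilde{y})\, \mathbb{1}[X \in B] \big) \\
&\overset{(50),(54)}{=} \inf_{y < \tilde{y}} E\big( \mathbb{1}[Y \le \tilde{y}] \cdot \mathbb{1}[X \in B] \big)
\overset{(*)}{=} E\big( \mathbb{1}[X \in B,\, Y \le y] \big) \\
&= P(X \in B,\, Y \le y).
\end{aligned}$$

The equalities marked by $(*)$ hold because of the monotone convergence theorem. Our assumption $Y \sim U[0, 1]$ and an application of (19) with $B = [0, 1]$ implies that Definition 1.3(b) holds, i.e., $Z$ is indeed a limit permutation. This finishes the proof of Lemma 2.2(a).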

It only remains to prove Lemma 2.2(b). Assume that $\hat{Z}$ is another limit permutation that satisfies (19). For any given $\tilde{y} \in \tilde{\mathbb{Q}}$, define

$$D_{\tilde{y}} = \big\{ x : Z(x, \tilde{y}) = \hat{Z}(x, \tilde{y}) \big\}.$$

From the above proof it follows that both $Z(X, \tilde{y})$ and $\hat{Z}(X, \tilde{y})$ satisfy the definition of $E\big( \mathbb{1}[Y \le \tilde{y}] \mid X \big)$. Thus by Theorem A.1(ii) we obtain $P\big( Z(X, \tilde{y}) = \hat{Z}(X, \tilde{y}) \big) = 1$, which implies $\lambda(D_{\tilde{y}}) = 1$. If we define $D = \bigcap_{\tilde{y} \in \tilde{\mathbb{Q}}} D_{\tilde{y}}$ then we have $\lambda(D) = 1$ by countable additivity.

By Definition 1.3(a) both $Z(x, \cdot)$ and $\hat{Z}(x, \cdot)$ are right continuous functions for any $x$, and if $x \in D$ then these two functions agree on $\tilde{\mathbb{Q}}$ and, therefore, they agree on the whole unit interval. This proves that $\lambda(\{x : Z(x, \cdot) = \hat{Z}(x, \cdot)\}) = 1$, which is a reformulation of (20), completing the proof of Lemma 2.2(b).